Commit d7a4e91

Update grammar to allow e-suffixes
This is aiming to match the behaviour of #131656

- Integer_literal: use SUFFIX rather than SUFFIX_NO_E
- FLOAT_BODY_WITHOUT_EXPONENT: remove eE not-predicate before suffix
- reserved_float_empty_exponent:
  - no longer apply if sign is absent
  - move before float_body_without_exponent in priority order
- reserved_float_based:
  - no longer apply if both sign and exponent are absent
1 parent 7f21f74 commit d7a4e91
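
To make the `reserved_float_based` item concrete, here is a minimal hand-written sketch (not the pest grammar itself) of the exponent branch of that rule before and after this commit. It covers only the `0b` / `0o` prefixes. It assumes that `LOW_BASE_PRETOKEN_DIGITS` is a possibly empty run of decimal digits and underscores, and that `EXPONENT_DIGITS` is optional underscores followed by a digit and then more digits or underscores; neither definition appears in this diff. The function name and the `new_rule` flag are hypothetical.

```rust
/// Length of the leading run of ASCII digits and underscores (may be zero).
fn digit_run(s: &[u8]) -> usize {
    s.iter().take_while(|b| b.is_ascii_digit() || **b == b'_').count()
}

/// Simplified model of the exponent branch of `Reserved_float_based`
/// for the `0b` / `0o` prefixes. Returns the matched prefix of `input`.
fn reserved_float_based(input: &str, new_rule: bool) -> Option<&str> {
    let s = input.as_bytes();
    if !(input.starts_with("0b") || input.starts_with("0o")) {
        return None;
    }
    // ASSUMPTION: LOW_BASE_PRETOKEN_DIGITS may be empty.
    let mut i = 2 + digit_run(&s[2..]);
    // Both versions require an exponent marker next.
    if !matches!(s.get(i).copied(), Some(b'e' | b'E')) {
        return None;
    }
    i += 1;
    if !new_rule {
        // Old rule: a bare `e` / `E` was enough to reserve the form.
        return Some(&input[..i]);
    }
    // New rule: the marker must be followed by a sign or by EXPONENT_DIGITS.
    if matches!(s.get(i).copied(), Some(b'+' | b'-')) {
        return Some(&input[..i + 1]);
    }
    // ASSUMPTION: EXPONENT_DIGITS is a digit/underscore run with at least one digit.
    let digits = digit_run(&s[i..]);
    if s[i..i + digits].iter().any(|b| b.is_ascii_digit()) {
        Some(&input[..i + digits])
    } else {
        None
    }
}

fn main() {
    for src in ["0b1e", "0b1e5", "0o7e+"] {
        println!(
            "{}: old = {:?}, new = {:?}",
            src,
            reserved_float_based(src, false),
            reserved_float_based(src, true)
        );
    }
}
```

Under this model `0b1e` is no longer reserved. With `Integer_literal` now using `SUFFIX`, it would presumably be accepted as an integer literal with suffix `e`, though that depends on the definition of `SUFFIX`, which this diff does not show.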

4 files changed, +35 -22 lines changed

src/lex_via_peg/pretokenisation/pest_pretokeniser.rs

+4 -2
@@ -173,7 +173,7 @@ fn interpret_pest_pair(pair: Pair<Rule>) -> Result<PretokenData, &'static str> {
         }
         Rule::Unterminated_literal_2015 | Rule::Reserved_literal_2021 => Ok(PretokenData::Reserved),
         Rule::Reserved_guard_2024 => Ok(PretokenData::Reserved),
-        Rule::Float_literal => {
+        Rule::Float_literal_1 | Rule::Float_literal_2 => {
             let mut body = None;
             let mut suffix = None;
             for sub in pair.into_inner().flatten() {
@@ -192,7 +192,9 @@ fn interpret_pest_pair(pair: Pair<Rule>) -> Result<PretokenData, &'static str> {
                 suffix: suffix.map(Into::into),
             })
         }
-        Rule::Reserved_float => Ok(PretokenData::Reserved),
+        Rule::Reserved_float_empty_exponent | Rule::Reserved_float_based => {
+            Ok(PretokenData::Reserved)
+        }
         Rule::Integer_literal => {
             let mut base = None;
             let mut digits = None;

src/lex_via_peg/pretokenisation/pretokenise.pest

+22 -20
@@ -8,8 +8,10 @@ PRETOKEN_2015 = {
     Double_quoted_literal_2015 |
     Raw_double_quoted_literal_2015 |
     Unterminated_literal_2015 |
-    Float_literal |
-    Reserved_float |
+    Float_literal_1 |
+    Reserved_float_empty_exponent |
+    Float_literal_2 |
+    Reserved_float_based |
     Integer_literal |
     Lifetime_or_label |
     Raw_identifier |
@@ -27,8 +29,10 @@ PRETOKEN_2021 = {
     Double_quoted_literal_2021 |
     Raw_double_quoted_literal_2021 |
     Reserved_literal_2021 |
-    Float_literal |
-    Reserved_float |
+    Float_literal_1 |
+    Reserved_float_empty_exponent |
+    Float_literal_2 |
+    Reserved_float_based |
     Integer_literal |
     Raw_lifetime_or_label_2021 |
     Reserved_lifetime_or_label_prefix_2021 |
@@ -49,8 +53,10 @@ PRETOKEN_2024 = {
     Raw_double_quoted_literal_2021 |
     Reserved_literal_2021 |
     Reserved_guard_2024 |
-    Float_literal |
-    Reserved_float |
+    Float_literal_1 |
+    Reserved_float_empty_exponent |
+    Float_literal_2 |
+    Reserved_float_based |
     Integer_literal |
     Raw_lifetime_or_label_2021 |
     Reserved_lifetime_or_label_prefix_2021 |
@@ -166,9 +172,11 @@ DECIMAL_PART = { '0'..'9' ~ DECIMAL_DIGITS }


 // ANCHOR: float_literal
-Float_literal = {
-    FLOAT_BODY_WITH_EXPONENT ~ SUFFIX ? |
-    FLOAT_BODY_WITHOUT_EXPONENT ~ !("e"|"E") ~ SUFFIX ? |
+Float_literal_1 = {
+    FLOAT_BODY_WITH_EXPONENT ~ SUFFIX ?
+}
+Float_literal_2 = {
+    FLOAT_BODY_WITHOUT_EXPONENT ~ SUFFIX ? |
     FLOAT_BODY_WITH_FINAL_DOT ~ !"." ~ !IDENT_START
 }

@@ -188,40 +196,34 @@ FLOAT_BODY_WITH_FINAL_DOT = {
 // ANCHOR_END: float_literal

 // ANCHOR: reserved_float
-Reserved_float = {
-    RESERVED_FLOAT_EMPTY_EXPONENT | RESERVED_FLOAT_BASED
-}
-RESERVED_FLOAT_EMPTY_EXPONENT = {
+Reserved_float_empty_exponent = {
     DECIMAL_PART ~ ("." ~ DECIMAL_PART ) ? ~
-    ("e"|"E") ~ ("+"|"-") ?
+    ("e"|"E") ~ ("+"|"-")
 }
-RESERVED_FLOAT_BASED = {
+Reserved_float_based = {
     (
         ("0b" | "0o") ~ LOW_BASE_PRETOKEN_DIGITS |
         "0x" ~ HEXADECIMAL_DIGITS
     ) ~ (
-        ("e"|"E") |
+        ("e"|"E") ~ ("+"|"-" | EXPONENT_DIGITS) |
         "." ~ !"." ~ !IDENT_START
     )
 }
 // ANCHOR_END: reserved_float

-
 // ANCHOR: integer_literals
 Integer_literal = {
     ( INTEGER_BINARY_LITERAL |
       INTEGER_OCTAL_LITERAL |
       INTEGER_HEXADECIMAL_LITERAL |
       INTEGER_DECIMAL_LITERAL ) ~
-    SUFFIX_NO_E ?
+    SUFFIX ?
 }

 INTEGER_BINARY_LITERAL = { "0b" ~ LOW_BASE_PRETOKEN_DIGITS }
 INTEGER_OCTAL_LITERAL = { "0o" ~ LOW_BASE_PRETOKEN_DIGITS }
 INTEGER_HEXADECIMAL_LITERAL = { "0x" ~ HEXADECIMAL_DIGITS }
 INTEGER_DECIMAL_LITERAL = { DECIMAL_PART }
-
-SUFFIX_NO_E = { !("e"|"E") ~ SUFFIX }
 // ANCHOR_END: integer_literals


writeup/numeric_literal_pretokens.md

+5
@@ -27,6 +27,9 @@ The following nonterminals are common to the definitions below:
 > Note: The `! "."` subexpression makes sure that forms like `1..2` aren't treated as starting with a float.
 > The `! IDENT_START` subexpression makes sure that forms like `1.some_method()` aren't treated as starting with a float.

+> Note: The `Reserved_float_empty_exponent` pretoken nonterminal is placed between `Float_literal_1` and `Float_literal_2` in priority order
+> (which is why there are two pretoken nonterminals producing `FloatLiteral`).
+

 #### Reserved float { .rule }

@@ -41,6 +44,8 @@ The following nonterminals are common to the definitions below:
 ##### Attributes
 (none)

+> Note: The `Reserved_float_empty_exponent` pretoken nonterminal is placed between `Float_literal_1` and `Float_literal_2` in priority order.
+> This ordering makes sure that forms like `123.4e+` are reserved, rather than being accepted by `FLOAT_BODY_WITHOUT_EXPONENT`).

 #### Integer literal { .rule }


writeup/pretokenising.md

+4
@@ -27,6 +27,7 @@ It's also available on a [single page](complete_pretoken_grammar.md).

 The pretoken nonterminals are presented in an order consistent with their appearance in the edition nonterminals.
 That means they appear in priority order (highest priority first).
+There is one exception, for floating-point literals and their related reserved forms (see [Float literal]).


 ### Extracting pretokens
@@ -51,6 +52,7 @@ Each pretoken nonterminal produces a single kind of pretoken.
 In most cases a given kind of pretoken is produced only by a single pretoken nonterminal.
 The exceptions are:
 - Several pretoken nonterminals produce `Reserved` pretokens.
+- There are two pretoken nonterminals producing `FloatLiteral` pretokens.
 - In some cases there are variant pretoken nonterminals for different editions.

 Each pretoken nonterminal (or group of edition variants) has a subsection on the following pages,
@@ -79,3 +81,5 @@ it's a bug in this specification.
 In other cases the attributes table entry defines the attribute value explicitly,
 depending on the characters consumed by the pretoken nonterminal or on which subexpression of the pretoken nonterminal matched.

+
+[Float literal]: numeric_literal_pretokens.md#float-literal
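
The priority-order exception for float literals can be illustrated with a small hand-written model of PEG ordered choice. This is only a sketch, not the pest grammar. The definitions of `FLOAT_BODY_WITH_EXPONENT`, `FLOAT_BODY_WITHOUT_EXPONENT`, `DECIMAL_DIGITS`, `EXPONENT_DIGITS` and `SUFFIX` are not shown in this diff, so the versions below are assumptions (with `SUFFIX` reduced to an ASCII identifier). The `FLOAT_BODY_WITH_FINAL_DOT` alternative is omitted, and `first_pretoken` with its `reserved_first` flag is a hypothetical helper.

```rust
#[derive(Debug, PartialEq)]
enum Kind {
    FloatLiteral,
    Reserved,
}

/// Byte at index `i`, if any.
fn at(s: &[u8], i: usize) -> Option<u8> {
    s.get(i).copied()
}

/// Length of the leading run of ASCII digits and underscores (may be zero).
fn digit_run(s: &[u8]) -> usize {
    s.iter().take_while(|b| b.is_ascii_digit() || **b == b'_').count()
}

/// DECIMAL_PART: a digit followed by (ASSUMED) digits or underscores.
fn decimal_part(s: &[u8]) -> Option<usize> {
    match at(s, 0) {
        Some(b) if b.is_ascii_digit() => Some(digit_run(s)),
        _ => None,
    }
}

/// ASSUMED FLOAT_BODY_WITHOUT_EXPONENT: DECIMAL_PART "." DECIMAL_PART.
fn body_without_exponent(s: &[u8]) -> Option<usize> {
    let a = decimal_part(s)?;
    if at(s, a) != Some(b'.') {
        return None;
    }
    Some(a + 1 + decimal_part(&s[a + 1..])?)
}

/// DECIMAL_PART ("." DECIMAL_PART)?
fn mantissa(s: &[u8]) -> Option<usize> {
    body_without_exponent(s).or_else(|| decimal_part(s))
}

/// ASSUMED SUFFIX: an ASCII identifier. Returns 0 when absent.
fn suffix(s: &[u8]) -> usize {
    match at(s, 0) {
        Some(b) if b.is_ascii_alphabetic() => s
            .iter()
            .take_while(|b| b.is_ascii_alphanumeric() || **b == b'_')
            .count(),
        _ => 0,
    }
}

/// Float_literal_1, with an ASSUMED FLOAT_BODY_WITH_EXPONENT.
fn float_literal_1(s: &[u8]) -> Option<usize> {
    let mut i = mantissa(s)?;
    if !matches!(at(s, i), Some(b'e' | b'E')) {
        return None;
    }
    i += 1;
    if matches!(at(s, i), Some(b'+' | b'-')) {
        i += 1;
    }
    // Exponent digits: a digit/underscore run containing at least one digit.
    let digits = digit_run(&s[i..]);
    if !s[i..i + digits].iter().any(|b| b.is_ascii_digit()) {
        return None;
    }
    i += digits;
    Some(i + suffix(&s[i..]))
}

/// Reserved_float_empty_exponent as written after this commit (sign required).
fn reserved_float_empty_exponent(s: &[u8]) -> Option<usize> {
    let i = mantissa(s)?;
    let marker = matches!(at(s, i), Some(b'e' | b'E'));
    let sign = matches!(at(s, i + 1), Some(b'+' | b'-'));
    if marker && sign { Some(i + 2) } else { None }
}

/// First alternative of Float_literal_2 only.
fn float_literal_2(s: &[u8]) -> Option<usize> {
    let n = body_without_exponent(s)?;
    Some(n + suffix(&s[n..]))
}

/// Ordered choice over the three rules; `reserved_first` selects the order.
fn first_pretoken(input: &str, reserved_first: bool) -> Option<(Kind, &str)> {
    let s = input.as_bytes();
    let f1 = float_literal_1(s).map(|n| (Kind::FloatLiteral, n));
    let r = reserved_float_empty_exponent(s).map(|n| (Kind::Reserved, n));
    let f2 = float_literal_2(s).map(|n| (Kind::FloatLiteral, n));
    let hit = if reserved_first { f1.or(r).or(f2) } else { f1.or(f2).or(r) };
    hit.map(|(kind, n)| (kind, &input[..n]))
}

fn main() {
    for src in ["123.4e+", "123.4e", "123.4e+5", "1e-"] {
        println!("{:?} -> {:?}", src, first_pretoken(src, true));
    }
}
```

With `reserved_first` set, `123.4e+` comes out as a single reserved form. With the two rules swapped, the same input would instead start with a float literal `123.4e` whose suffix is `e`, which is the outcome the ordering is meant to prevent.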
