-
Notifications
You must be signed in to change notification settings - Fork 767
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[lex] Provide unicode name for all control characters #7404
base: main
Are you sure you want to change the base?
Conversation
Note that I also did a review of all usage of the word "space" in [lex] and [cpp]. Some uses were not relevant, such as "name space". Others applied to the contents of literals, which might not be in the basic source encoding. This PR is considered to completely address the concern for control characters throughout the standard. |
84bfaf1
to
e289bba
Compare
This looks reasonable and correct |
source/lex.tex
Outdated
@@ -477,7 +477,7 @@ | |||
characters \tcode{*/}. These comments do not nest. | |||
\indextext{comment!\tcode{//}}% | |||
The characters \tcode{//} start a comment, which terminates immediately before the | |||
next new-line character. If there is a form-feed or a vertical-tab | |||
next new-line character. If there is a \unicode{0009}{character tabulation} or a \unicode{000b}{line tabulation} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's wrong. Form feed is 0x0c .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks --- a foolish error well spotted.
source/lex.tex
Outdated
@@ -824,7 +824,7 @@ | |||
\end{footnote} | |||
operators, and other separators. | |||
\indextext{whitespace}% | |||
Blanks, horizontal and vertical tabs, newlines, formfeeds, and comments | |||
Comments, \unicode{0020}{space}s, \unicode{0009}{character tabulation}s and \unicode{0009}{line tabulation}s, \unicode{000c}{form feed}s, and newlines |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think plural works for the way we present Unicode chars. Please rephrase.
Also, we seem to use "new-line" elsewhere (e.g. near the changes above). Note the hyphen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I specifically copied newlines from the original list, as I had seen both terms in the standard. Now the I search again, there are only 4 remaining, and half are in code comments. I will fix this one here, and file a separate PR to fix the rest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rewording for plurals attempted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And it turns out the remaining three uses of "newline" are all in library clauses, which outnumber the one use of "new-line" in the library, so I will defer that question for now. The fix for new-line above is now in this PR though.
This commit does not touch the new-line character as paper P2348. It resricts itself to consistent use of the unicode character name for space, horizontal tab, and vertical tab. Compared to PR cplusplus#7359 it deliberately does not touch the grammar that would necessitate a review by core review. The intent is to rebase that PR if this one lands.
e289bba
to
758aa5d
Compare
Responded to Jens's comments and force-pushed to avoid badness in the history. |
This commit does not touch the new-line character as paper P2348. It resricts itself to consistent use of the unicode character name for space, horizontal tab, and vertical tab. Compared to PR #7359 it deliberately does not touch the grammar that would necessitate a review by core review. The intent is to rebase that PR if this one lands.