Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[lex] Provide unicode name for all control characters #7404

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

AlisdairM
Copy link
Contributor

This commit does not touch the new-line character as paper P2348. It resricts itself to consistent use of the unicode character name for space, horizontal tab, and vertical tab. Compared to PR #7359 it deliberately does not touch the grammar that would necessitate a review by core review. The intent is to rebase that PR if this one lands.

@AlisdairM
Copy link
Contributor Author

Note that I also did a review of all usage of the word "space" in [lex] and [cpp]. Some uses were not relevant, such as "name space". Others applied to the contents of literals, which might not be in the basic source encoding. This PR is considered to completely address the concern for control characters throughout the standard.

@cor3ntin
Copy link
Contributor

This looks reasonable and correct

source/lex.tex Outdated
@@ -477,7 +477,7 @@
characters \tcode{*/}. These comments do not nest.
\indextext{comment!\tcode{//}}%
The characters \tcode{//} start a comment, which terminates immediately before the
next new-line character. If there is a form-feed or a vertical-tab
next new-line character. If there is a \unicode{0009}{character tabulation} or a \unicode{000b}{line tabulation}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's wrong. Form feed is 0x0c .

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks --- a foolish error well spotted.

source/lex.tex Outdated
@@ -824,7 +824,7 @@
\end{footnote}
operators, and other separators.
\indextext{whitespace}%
Blanks, horizontal and vertical tabs, newlines, formfeeds, and comments
Comments, \unicode{0020}{space}s, \unicode{0009}{character tabulation}s and \unicode{0009}{line tabulation}s, \unicode{000c}{form feed}s, and newlines
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think plural works for the way we present Unicode chars. Please rephrase.

Also, we seem to use "new-line" elsewhere (e.g. near the changes above). Note the hyphen.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I specifically copied newlines from the original list, as I had seen both terms in the standard. Now the I search again, there are only 4 remaining, and half are in code comments. I will fix this one here, and file a separate PR to fix the rest.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rewording for plurals attempted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And it turns out the remaining three uses of "newline" are all in library clauses, which outnumber the one use of "new-line" in the library, so I will defer that question for now. The fix for new-line above is now in this PR though.

This commit does not touch the new-line character as paper
P2348.  It resricts itself to consistent use of the unicode
character name for space, horizontal tab, and vertical tab.
Compared to PR cplusplus#7359 it deliberately does not touch the
grammar that would necessitate a review by core review.
The intent is to rebase that PR if this one lands.
@AlisdairM
Copy link
Contributor Author

Responded to Jens's comments and force-pushed to avoid badness in the history.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants