Add to_ascii_upper, to_ascii_lower and eq_ignore_ascii_case in std::ascii #8231
On first glance this seems fairly specialized to Servo's use cases - CSS and HTML. I'd be curious to know if there are others. We need to be cautious about adding features to both |
Also, we already have See also #5822 |
I use http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#ascii-case-insensitive It is gonna be needed in a bunch of different libraries. The options are:
I’d really prefer to avoid all but the last of these options. Would it be better to have these in the |
If they go in std, then placing them into std::ascii would probably be the best. |
I'm fine with it being implemented in |
Updated pull request: have functions in
Maybe I should not worry about it and replace it with |
Could you unify the UPPER_MAP and LOWER_MAP case folding with the case folding for Both seem to do the same thing, so having two wildly different implementations in the same module seems kinda unnecessary. (As an aside: is there a specific reason why you need a lookup table for ascii here? Isn't case in the ascii range just defined by a single bit?) Same thing with function naming: Consistency would be nice here too. Lastly, it would be nice if you could methodify the new functions where applicable (add a |
About naming/unification: The Still, please indicate the naming / kind of interface that you prefer and I can update this pull request. About implementation: Yes, case in the ASCII range is a single bit. These functions can be implemented in at least two ways: with a lookup in a 256-byte vector, or with a bit flip conditioned by two comparisons. I don’t really know which is faster or otherwise better, but I went with the former as it seems to be what CPython and my system’s libc are doing. I could change |
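For readers following along, the two approaches named above can be sketched roughly as follows. This is written in present-day Rust syntax rather than the syntax of the time, and it is not the code from this pull request: the function names `build_upper_map`, `upper_by_table` and `upper_by_bitflip` are hypothetical, and only the name `UPPER_MAP` comes from the discussion.

```rust
// Illustrative sketch only; not the pull request's implementation.

/// Build a 256-entry table at compile time. Every byte maps to itself,
/// except b'a'..=b'z', which map to their uppercase counterparts.
const fn build_upper_map() -> [u8; 256] {
    let mut map = [0u8; 256];
    let mut i = 0;
    while i < 256 {
        let b = i as u8;
        map[i] = if b >= b'a' && b <= b'z' { b - 32 } else { b };
        i += 1;
    }
    map
}

static UPPER_MAP: [u8; 256] = build_upper_map();

/// Approach 1: a single indexed load from the 256-byte table.
fn upper_by_table(b: u8) -> u8 {
    UPPER_MAP[b as usize]
}

/// Approach 2: two comparisons guarding a flip of bit 0x20,
/// the single bit that separates ASCII lowercase from uppercase.
fn upper_by_bitflip(b: u8) -> u8 {
    if b >= b'a' && b <= b'z' {
        b ^ 0x20
    } else {
        b
    }
}
```

Both functions leave bytes outside `a` to `z` untouched, including bytes above 0x7F, so under this sketch they agree on every input; which one is faster is a separate question that only a benchmark can answer.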
Sure, not talking about converting it to All the No idea which one is faster, so maybe just use the bitflip one because it is shorter? |
Another way to implement this is with a match on a range pattern (leaving it to the compiler how to implement that). A micro-benchmark shows that the lookup table is 1.2x to 2.2x faster than either of the other solutions: https://gist.github.com/SimonSapin/6156068#file-summary Updated PR: change |
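The range-pattern variant mentioned above might look something like this in present-day Rust. Again this is only a sketch under assumptions: `upper_by_match` and `eq_ignore_ascii_case_sketch` are hypothetical names, and the comparison helper is just one plausible way to build the case-insensitive equality from the PR title on top of a per-byte mapping.

```rust
// Illustrative sketch only; not the pull request's implementation.

/// Map a byte to ASCII uppercase with a match on a range pattern,
/// leaving the choice of machine code to the compiler.
fn upper_by_match(b: u8) -> u8 {
    match b {
        b'a'..=b'z' => b - 32,
        _ => b,
    }
}

/// Compare two byte strings, ignoring case differences in the ASCII
/// range only. Non-ASCII bytes must match exactly.
fn eq_ignore_ascii_case_sketch(a: &[u8], b: &[u8]) -> bool {
    a.len() == b.len()
        && a.iter()
            .zip(b)
            .all(|(&x, &y)| upper_by_match(x) == upper_by_match(y))
}
```

Because only `a` to `z` are folded, a comparison such as `b"Content-Type"` against `b"content-type"` succeeds while any difference in non-ASCII bytes still counts as a mismatch.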
Oh noes! The build failed because of |
Original pull request: Add str.to_ascii_lower() and str.to_ascii_upper() methods in std::str.