Add module for validating ASCII characters and upper/lower conversion #32
Would you mind rebasing on top of the latest master to pick up the CI tests?
src/tests/ascii/test_ascii.f90
program test_ascii

use stdlib_experimental_error, only: assert
use stdlib_experimental_ascii
Can you please explicitly import the symbols that are being used? I think we should follow that approach of "explicit imports", as in Python.
(The reason I noticed this is that I came here to see what the public API actually is, which is a bit hard to see immediately from the module itself, because there is not a single public line, but rather quite a few symbols are decorated with public.)
I think we should have private declared at the top of all modules, and below it one or more lists of public :: ...
@zbeekman, I agree, it makes it very easy to see what the public symbols are.
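The layout suggested above can be sketched as follows. This is only an illustration: the module name `ascii_layout_sketch` and the procedures `is_digit_sketch`, `is_alpha_sketch`, and `in_range` are hypothetical and are not the symbols of the module under review.

```fortran
module ascii_layout_sketch
    implicit none
    private   ! default: nothing is exported

    ! The whole public API is listed here, in one place.
    public :: is_digit_sketch, is_alpha_sketch

contains

    ! Private helper (not exported): true if lo <= c <= hi in ASCII order.
    pure logical function in_range(c, lo, hi)
        character(len=1), intent(in) :: c, lo, hi
        in_range = iachar(c) >= iachar(lo) .and. iachar(c) <= iachar(hi)
    end function in_range

    ! True for the decimal digits 0-9.
    pure logical function is_digit_sketch(c)
        character(len=1), intent(in) :: c
        is_digit_sketch = in_range(c, '0', '9')
    end function is_digit_sketch

    ! True for the ASCII letters A-Z and a-z.
    pure logical function is_alpha_sketch(c)
        character(len=1), intent(in) :: c
        is_alpha_sketch = in_range(c, 'A', 'Z') .or. in_range(c, 'a', 'z')
    end function is_alpha_sketch

end module ascii_layout_sketch
```

With this layout a caller can write `use ascii_layout_sketch, only: is_digit_sketch`, and a reader can find the exported names without scanning every declaration.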
So I think this looks really nice. The public API seems nicely done, the naming convention I think is good. Will the The I think using
Python also has an isupper method. Should we also implement
General question: should we maintain I think getting the unicode working (as in Python) will be some work, so we can start with Things like
You can see the
I think we should have separate modules for ASCII and Unicode characters. In fact, using only the intrinsic Fortran character functions (achar and iachar), it is not possible to find, say, uppercase Slavic letters č, š, ž... I think it will be necessary to interface with C to achieve Unicode support. A second issue is that some preprocessing will be necessary, as not all Fortran compilers support the extended Unicode character set. The current ascii module already contains I will rebase and import explicitly the public functions for the test driver asap.
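A minimal sketch of the ASCII-only limitation described above, assuming a hypothetical module `ascii_case_sketch` and function `to_upper_str` (neither is taken from this PR): the conversion shifts only the codes of a to z, so every other character, including the bytes of an encoded č, š, or ž, passes through unchanged.

```fortran
module ascii_case_sketch
    implicit none
    private
    public :: to_upper_str

contains

    ! Upper-case the ASCII letters a-z; leave every other character unchanged.
    pure function to_upper_str(s) result(t)
        character(len=*), intent(in) :: s
        character(len=len(s)) :: t
        ! Distance between lower- and upper-case letters in the ASCII table.
        integer, parameter :: shift = iachar('a') - iachar('A')
        integer :: i, k

        do i = 1, len(s)
            k = iachar(s(i:i))
            if (k >= iachar('a') .and. k <= iachar('z')) then
                t(i:i) = achar(k - shift)
            else
                t(i:i) = s(i:i)   ! digits, punctuation, non-ASCII bytes
            end if
        end do
    end function to_upper_str

end module ascii_case_sketch
```

Handling letters outside this range would need a case-mapping table or an external library, which is the reason given above for keeping Unicode support in a separate module.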
@ivan-pi yes, if it is not possible to merge Unicode with ASCII, then we need two modules.
The other thing is: since you used https://github.com/dlang/phobos/blob/434429f273d0359744b6d3ba9db36d3bef1c7593/std/ascii.d as the original, we have to cite their license. Overall this looks good to me. It would be nice to get some more reviews on this before we merge. @marshallward, @jacobwilliams, @milancurcic do you have any feedback on the API here?
Great, thanks @ivan-pi!
Interesting format for comments (some documentation generator?) but fine with me.
Good to merge IMO.
@milancurcic thanks for the review. @ivan-pi would you mind updating
Since I've been pointed here, this project might be interesting: https://github.com/lemire/fastvalidate-utf-8 although I don't see how that could be implemented in Fortran given missing inline assembly.
@dev-zero thanks! I made a comment in #11 (comment).
This PR addresses #11 (there are a few open questions left there).
Tested with both the gfortran and Intel Fortran compilers for the 'default' (ascii) character set.
The tests are essentially a port of those at https://github.com/dlang/phobos/blob/master/std/ascii.d (hopefully not a licensing issue?).