Add testing module to allow better structuring of test suites #494
Conversation
Thank you for starting this; I think it has been much needed for stdlib.
We already have a sizeable test suite in stdlib, so porting all tests will take a while. I'll keep this list here up to date with the state of this patch.
Thanks @awvwgk. This is highly needed, indeed!
@awvwgk, as an exercise and if you agree, I'd like to tackle all the tests for the io module.
@milancurcic @jvdp1 thanks for the offer; feel free to send a PR against my fork, but please ping me there or I might overlook it.
I will give it a start. @awvwgk: is it better to keep all stats functions in different tests, or should I merge them into a single program?
The testing framework should encourage writing smaller tests, such that one unit test is only testing one thing. For now I suggest the following strategy:
Don't spend too much time on step 3; we will refactor all driver programs at a later point to make them more flexible. A sketch of what such a focused test could look like follows below.
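To make the "one unit test tests one thing" idea concrete, here is a minimal sketch of how a group of small stats tests could be collected, assuming the vendored stdlib_test module keeps test-drive's unittest_type/new_unittest/check names; the mean checks and module name are made up for illustration:

```fortran
module test_stats_mean
    use stdlib_test, only : new_unittest, unittest_type, error_type, check
    use stdlib_stats, only : mean
    implicit none
    private
    public :: collect_mean

contains

    !> Each behaviour gets its own small unit test instead of one big program
    subroutine collect_mean(testsuite)
        type(unittest_type), allocatable, intent(out) :: testsuite(:)

        testsuite = [ &
            new_unittest("mean-int", test_mean_int), &
            new_unittest("mean-real", test_mean_real) &
            ]
    end subroutine collect_mean

    subroutine test_mean_int(error)
        type(error_type), allocatable, intent(out) :: error

        ! mean of an integer array is returned in double precision
        call check(error, abs(mean([1, 2, 3]) - 2.0d0) < 1.0d-12)
    end subroutine test_mean_int

    subroutine test_mean_real(error)
        type(error_type), allocatable, intent(out) :: error

        call check(error, abs(mean([1.0, 2.0, 3.0]) - 2.0) < 1.0e-6)
    end subroutine test_mean_real

end module test_stats_mean
```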
@awvwgk, an additional question: it seems that it is not possible to check whether two (float) arrays are the same. Should I use the boolean check, or should I extend check_float to higher dimensions?
Using a boolean check for now seems best. I prefer to only change one thing at a time to keep patches small. This one is already a huge patch and we still have to go through all the tests, so I would defer implementing new features in the testing framework for the moment. I was planning to transfer the test-drive project to @fortran-lang once it is used in stdlib and fpm. Then we can develop its capabilities further. Checks on arrays are a bit more difficult because there are different possible ways to define a threshold, therefore I would like to discuss this first before implementing something ad hoc.
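In the meantime, the boolean check over a whole array could look like this; the tolerance is an arbitrary example value, not a framework default:

```fortran
subroutine test_array_values(error)
    use stdlib_test, only : error_type, check
    type(error_type), allocatable, intent(out) :: error
    real :: expected(3), actual(3)

    expected = [1.0, 2.0, 3.0]
    actual = [1.0, 2.0, 3.0]   ! stand-in for the computation under test

    ! all() reduces the elementwise comparison to the scalar logical
    ! that check expects; the tolerance is an example value only
    call check(error, all(abs(actual - expected) < 1.0e-6), &
        "arrays differ by more than the tolerance")
end subroutine test_array_values
```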
The following tests can't be ported because they are not thread-safe:
For the logger we can use the local logger instance, no? I don't know what to do about the other one, but it wouldn't be the end of the world if we omitted a test or two.
It might be possible to rewrite the tests with a local logger instance. The main difficulty at the moment is figuring out whether a test depends on the final state generated by the previous test. This cost me quite a while in the bitsets test, which saved a few lines of initialization in every test by exploiting the fact that a previous test had already initialized a global variable.
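A hedged sketch of what such a rewrite could look like, using a local logger_type instance attached to a scratch unit so no state leaks between tests; the stdlib_logger method names follow its documented API, and the check usage assumes the vendored test-drive names:

```fortran
subroutine test_local_logger(error)
    use stdlib_logger, only : logger_type, success
    use stdlib_test, only : error_type, check
    type(error_type), allocatable, intent(out) :: error
    type(logger_type) :: logger   ! local instance, nothing shared between tests
    integer :: unit, stat

    ! log to a scratch file instead of the global logger's output unit
    open(newunit=unit, status='scratch', action='readwrite')
    call logger%add_log_unit(unit, stat)
    call check(error, stat == success, "could not attach scratch unit")
    if (allocated(error)) return

    call logger%log_information("visible only to this local logger")
    call logger%remove_log_unit(unit, close_unit=.true.)
end subroutine test_local_logger
```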
I opened a PR with io tests here: awvwgk#87 (just in case there were no notifications; I'm new to this workflow). The tests there won't pass with ifort until #506 is merged. I think there are two ways forward:
I opened awvwgk#90 which addresses the failing ifort tests.
That's fine; you ported so many already.
I think there is some value in getting this PR merged at the current stage, since it will make the testing framework available, port the largest share of the tests, and allow us to port the rest without risking merge conflicts, by making smaller PRs. From my side this PR is complete and ready for review. As described above, I have plans for following up once it is merged. I can update the PR before or after the review, that is up to you. But I don't want to do it twice, because it takes at least an hour of work to check all the commits when rebasing.
I think the main change in this PR, the move to a unit testing framework, is fine. The changes to a typical test seem minor and quite minimal. So unless anybody has objections, why don't we merge this? That is, let's bring this PR to a state where it merges cleanly and all tests pass. The rest of the tests do not have to be ported; they can just continue running the old way. I think both ways can coexist, and then we port the remaining tests as we have time once this PR is merged.
I'm all for moving forward here. However, the list of things to keep in mind and fix has been growing week by week now. I think we also have a hard blocker, which requires some work first: #534, since many tests have now moved to using fypp. I can for sure dedicate an evening/night to update this branch. I'm currently just waiting for a clear signal here on what to do next.
I was hoping to get all initial tests ported since we're so close, but this "bar" is arbitrary. I don't see a good reason not to merge this as is, pending resolved conflicts. @awvwgk let me know if you'd like me to help with resolving outstanding conflicts.
Is the list in this comment up to date? If yes, I will try to find some time to move some of the remaining tests. Otherwise, we can always open an issue and assign it to me for later.
@awvwgk I brought this branch up to date with master and resolved conflicts. There's an outstanding issue with fpm deployment that I need some advice on. The issue currently is with lines 34 to 39 in d14fca8, which causes a problem when the CI runs.
I see a few solutions:
I think the 3rd is the best approach, but it is a little bit more work. What do you think? And if so, should we sort that out in this PR, or do a quick fix here (option 1) and then have a separate PR to switch to test-drive as an fpm dependency?
Let's go with option 3. I opened fortran-lang/test-drive#6 to make test-drive aware of quadruple precision and started working on a solution in awvwgk#103. For using test-drive directly we have to:
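If it helps, option 3 would presumably reduce to replacing the vendored sources with a dev-dependency entry in stdlib's fpm.toml, along these lines (a sketch; the exact entry and any revision pinning would still need to be settled):

```toml
# available to the test suite only; not propagated to downstream
# projects that depend on stdlib
[dev-dependencies]
test-drive.git = "https://github.com/fortran-lang/test-drive"
```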
We might have to increase the threshold for the sleep test on MacOS a bit further:
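For context, the check in question presumably has roughly this shape (a sketch, not the actual test code; sleep_ms and the threshold value are illustrative):

```fortran
subroutine test_sleep_duration(error)
    use, intrinsic :: iso_fortran_env, only : int64
    use stdlib_test, only : error_type, check
    type(error_type), allocatable, intent(out) :: error
    integer(int64) :: t_start, t_end, rate
    integer :: elapsed_ms
    integer, parameter :: duration_ms = 100
    integer, parameter :: threshold_ms = 75   ! the knob to loosen on MacOS CI

    call system_clock(t_start, count_rate=rate)
    call sleep_ms(duration_ms)   ! hypothetical wrapper around the sleep under test
    call system_clock(t_end)

    elapsed_ms = int((t_end - t_start) * 1000_int64 / rate)
    call check(error, abs(elapsed_ms - duration_ms) <= threshold_ms, &
        "sleep duration outside the allowed slack")
end subroutine test_sleep_duration
```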
OK for me as it is currently.
I have no time now to move the remaining stats tests. An issue could be assigned to me when merging this PR.
Many thanks to everybody who contributed to this project. To move forward, I'll go ahead and merge this PR now.
For prior discussions on this topic see #162.
This is an attempt to improve the test suite in stdlib. It uses a vendored version of test-drive, the testing framework I wrote for fpm, named stdlib_test for now. It currently has the status of a testing dependency, i.e. the module is available in stdlib's test suite but is not part of stdlib's public API; downstream projects cannot use it.

Why vendor a dependency?
I'm not a fan of vendoring dependencies, but I found no better solution given the constraints of stdlib's build systems.
Why not vegetables?
I ported the tests for the stdlib_ascii module by replacing the check function of stdlib_error, and added the test suite for the testing framework itself as well. The former uses the finalizer to check for failed tests while the latter propagates the error back. Both will correctly report on failure; the latter approach can additionally recover and run the remaining tests.

I'm opening this for feedback. Let me know what you think and whether we want to adopt this testing framework.
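To illustrate the difference between the two styles (a sketch; the stdlib_test names mirror test-drive's API as vendored here):

```fortran
! Old style: check from stdlib_error stops the whole program on failure.
!   use stdlib_error, only : check
!   call check(is_digit("1"), "is_digit failed")

! New style: the error is allocated and propagated back, so the runner
! can report the failure, recover, and continue with the remaining tests.
subroutine test_is_digit(error)
    use stdlib_ascii, only : is_digit
    use stdlib_test, only : error_type, check
    type(error_type), allocatable, intent(out) :: error

    call check(error, is_digit("1"), "is_digit failed")
    if (allocated(error)) return
end subroutine test_is_digit
```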
If we agree on this approach I will move all existing tests to the new format. In a follow-up patch I would try to bundle all tests into a single test executable and allow selection of the individual suites by the framework. This simplifies the logic for setting up the tests and would make it easier to adopt fypp in the test suite without over-engineering the CMake macro / make include file that generates the tests.
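A sketch of what that single driver could look like, following the runner pattern test-drive documents (suite and module names are illustrative):

```fortran
program tester
    use, intrinsic :: iso_fortran_env, only : error_unit
    use stdlib_test, only : run_testsuite, new_testsuite, testsuite_type
    use test_stats_mean, only : collect_mean   ! hypothetical suite module
    implicit none
    type(testsuite_type), allocatable :: testsuites(:)
    character(len=*), parameter :: fmt = '("#", *(1x, a))'
    integer :: stat, is

    stat = 0
    testsuites = [ &
        new_testsuite("mean", collect_mean) &
        ]

    do is = 1, size(testsuites)
        write(error_unit, fmt) "Testing:", testsuites(is)%name
        call run_testsuite(testsuites(is)%collect, error_unit, stat)
    end do

    if (stat > 0) then
        write(error_unit, '(i0, 1x, a)') stat, "test(s) failed!"
        error stop
    end if
end program tester
```

Selecting an individual suite would then be a matter of matching a command-line argument against testsuites(is)%name before running it.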
Output from stdlib_ascii tester
Output from stdlib_test tester