[STDLIB_STATS] need to upgrade stdlib_stats codes about compilation efficiency #438
Comments
Thank you for trying [...] The aim of [...] The API of the functions in [...] Due to the "complexity" of the API of [...], I don't think that a solution like the one proposed by muesli would be appropriate for this, because the aim was to provide procedures for Fortran arrays (but I may be wrong; at least it is how I find [...]). Anyway, I agree that compilation of this part can be an issue, one that could grow later with the introduction of new functions in [...]
Thanks, I understand. I don't know much about [...]
I think -DCMAKE_MAXIMUM_RANK should be 4 by default; that way stdlib will work by default for almost everybody. This is especially useful for new users who are not familiar with stdlib.
This is indeed a good idea. I am for it. @awvwgk @milancurcic @ivan-pi what is your opinion about making [...]
I think that just because we can doesn't mean we have to compile with full rank support; the stats modules especially get quite compilation-intensive for no good reason. I'm usually compiling with 4 anyway, sometimes with 7 if I make system-wide installations, but I have yet to exceed rank 4 in any actual application of stdlib. The CMake template for stdlib also reduces the max rank by default. A max rank of 4 sounds like a more sensible default. I'm still looking forward to packaging stdlib once we start putting a version on it, where a higher maximum rank than 4 might be much more relevant, because end users can't recompile if they depend on a binary distribution.
Resolved by changing the default maximum rank in the CMake build files.
Overview: Compilation time is too long.
While building stdlib, I found that compiling stdlib_stats uses a lot of computer resources, especially RAM. This is related to the high-dimensional matrix dimensions (array ranks) defined in stdlib_stats, which greatly reduce the build efficiency of stdlib and increase its overall compilation time.
It took my computer (CPU: Intel i5-8250U) more than two hours to compile stdlib completely. With RANK=15, the compiled size of stdlib reached 747MB.
I took a quick look at the source code and thought that there might be a better way than a polymorphic interface with such a large number of multi-dimensional array arguments.
(see high-dimensional matrix dimensions)
(see RANK)
My understanding is: Rethink, need to be more flexible.
In Fortran, the extent of a single dimension can in theory grow without limit, but the number of dimensions has to be defined manually by the user.
In the future, we will also build a large number of functions that use matrices. The current implementation of stdlib_stats is unreasonable, not adaptable, and needs to be improved (see stdlib_stats_moment.fypp).
stdlib_stats presets several basic dimensions to form a polymorphic interface, and sets multiple judgments (see condition judgments) on the number of dimensions to process, resulting in a decrease in compilation speed and an increase in compilation load.
#281
#283
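To make the compilation load more concrete, here is a rough counting sketch in Python (it is not part of stdlib). It assumes a hypothetical generic routine that is specialized once per element kind and once per array rank from 1 up to the maximum rank. The kind count of 10 is an illustrative placeholder, not a number measured from stdlib_stats, and `count_specializations` is a made-up helper name.

```python
def count_specializations(num_kinds: int, max_rank: int) -> int:
    """Count generated procedures for one generic routine.

    Assumes one specific procedure per (kind, rank) pair, with ranks
    running from 1 to max_rank. A real code generator may emit more
    variants per pair, so treat this as a lower bound.
    """
    if num_kinds < 0 or max_rank < 0:
        raise ValueError("num_kinds and max_rank must be non-negative")
    return num_kinds * max_rank


# Hypothetical number of element kinds, chosen only for illustration.
NUM_KINDS = 10

for max_rank in (4, 7, 15):
    total = count_specializations(NUM_KINDS, max_rank)
    print(f"max rank {max_rank:2d}: {total} specific procedures per generic routine")
```

Under these assumptions the count grows linearly with the maximum rank, so lowering the maximum rank from 15 to 4 removes nearly three quarters of the specific procedures generated for each generic routine.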
My solution is: Set up a matrix parser, or use a single-dimensional matrix algorithm.
If an operation does not need communication between the different dimensions, we can achieve the same effect by providing it only for one-dimensional column vectors and handing the dimension-specific handling to the user, which improves the versatility and flexibility of stdlib.
Or we use the wiki solution in stdlib to set up a matrix parser and transform the array when necessary to meet the polymorphic needs of multi-dimensional arrays.
I have seen another library, and its solution is also good: muesli!
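The one-dimensional idea can be sketched as follows. This is a minimal illustration in Python, using nested lists as a stand-in for multi-dimensional arrays. The names `flatten` and `mean_1d` are hypothetical and do not correspond to any stdlib or muesli routine.

```python
def flatten(array):
    """Flatten an arbitrarily nested list of numbers into one flat list."""
    if isinstance(array, (int, float)):
        return [float(array)]
    flat = []
    for item in array:
        flat.extend(flatten(item))
    return flat


def mean_1d(values):
    """Mean of a one-dimensional sequence: the only routine the library ships."""
    if not values:
        raise ValueError("mean of an empty sequence is undefined")
    return sum(values) / len(values)


# A 2 x 3 array stored as nested lists.
matrix = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0]]

# Whole-array statistic: the rank does not matter once the data is flat.
overall = mean_1d(flatten(matrix))

# Per-dimension statistics: the caller has to slice along the dimension.
row_means = [mean_1d(row) for row in matrix]
column_means = [mean_1d(column) for column in zip(*matrix)]

print(overall, row_means, column_means)
```

A whole-array statistic needs only the flat routine, while a reduction along one dimension moves the slicing work to the caller. That is the trade-off between a small library and convenient support for multi-dimensional arrays.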
I don't know much more about stdlib_stats, so my idea may have limitations. However, I think the multi-dimensional array polymorphic interface in stdlib_stats needs to be improved.
I hope this starts a discussion. Thank you all! 😍