bitset count performance #724
Comments
Fortran intrinsic function
Looks like, but then why is it not used in
If I remember well, However, I agree that @PierUgit's solution or using
@PierUgit, do you still have such an implementation (I recognize that my answer is quite late!)? If so, would you like to open a PR with your solution/optimization?
@jvdp1 I don't have code ready on GitHub; I did some tests independently on a code of mine... To be honest, I first have to learn how to use fpm and build stdlib before being able to submit PRs here...
Is it OK for you if I submit a PR and ask you to review/test it?
I usually use CMake for building stdlib. However, I agree that the doc is not clear enough for building
It was actually much easier than expected! I could build stdlib with fpm, modify the code, and add a test case (this routine was not tested). I will submit a PR soon. I did it from the stdlib-fpm branch, though, and it looks quite far behind the master branch...
Actually, development should always be done from the master branch. The branch
Probably I should close the PR and restart from the master branch...
OK. Indeed, probably better and easier.
Regarding the build with fpm, what I do, and want to include as info in the README, is the following:
    source ./ci/fpm-deployment.sh
    cd stdlib-fpm/
    fpm test --profile release
The fpm-deployment.sh script takes care of processing the files and creating the stdlib-fpm directory, which contains only .f90 files. BTW, in my PR regarding str2num I included a change to this script, as we realized that the .fypp files in the test folder were not processed.
In the bitsets module, the bit_count_large() function scans bit by bit with btest() in the main loop:
Using chunks of blocks and the fact that btest() is elemental, one can get a significant performance improvement (on a similar implementation I obtain a 5x speed-up with a chunk size of 1024):
(edit) Of course, this makes sense only for a large number of bits.
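The code blocks of the original post did not survive here, so the following is only a minimal sketch of the idea described above, not the code from the issue or from stdlib. It assumes the bits are stored in a plain array of 64-bit integer blocks and counts every bit of every block (a partially filled last block is not handled). The names bit_count_sketch, count_bitwise and count_chunked are hypothetical, and popcnt is used only as an independent reference for the self-check.

```fortran
module bit_count_sketch
    use, intrinsic :: iso_fortran_env, only: int64
    implicit none
    private
    public :: count_bitwise, count_chunked

    integer, parameter :: block_size = 64    ! bits per int64 block (assumption)
    integer, parameter :: chunk_size = 1024  ! blocks per chunk, the value quoted in the post

contains

    ! Baseline: one scalar btest() call per bit of each block.
    pure function count_bitwise(blocks) result(n)
        integer(int64), intent(in) :: blocks(:)
        integer(int64) :: n
        integer :: ib, pos
        n = 0
        do ib = 1, size(blocks)
            do pos = 0, block_size - 1
                if (btest(blocks(ib), pos)) n = n + 1
            end do
        end do
    end function count_bitwise

    ! Chunked: btest() is elemental, so a single call tests bit `pos`
    ! of a whole chunk of blocks, and count() sums the resulting mask.
    pure function count_chunked(blocks) result(n)
        integer(int64), intent(in) :: blocks(:)
        integer(int64) :: n
        integer :: lo, hi, pos
        n = 0
        do lo = 1, size(blocks), chunk_size
            hi = min(lo + chunk_size - 1, size(blocks))
            do pos = 0, block_size - 1
                n = n + count(btest(blocks(lo:hi), pos))
            end do
        end do
    end function count_chunked

end module bit_count_sketch

program demo
    use, intrinsic :: iso_fortran_env, only: int64
    use bit_count_sketch, only: count_bitwise, count_chunked
    implicit none
    integer, parameter :: nblocks = 5000  ! not a multiple of chunk_size, to exercise the tail chunk
    integer(int64) :: blocks(nblocks), reference
    integer :: i

    ! Deterministic, arbitrary bit patterns for the check.
    do i = 1, nblocks
        blocks(i) = int(i, int64) * 2654435761_int64
    end do

    ! Independent reference count via the popcnt intrinsic.
    reference = sum(int(popcnt(blocks), int64))

    if (count_bitwise(blocks) /= reference) error stop "bitwise count mismatch"
    if (count_chunked(blocks) /= reference) error stop "chunked count mismatch"
    print *, "all counts agree"
end program demo
```

The sketch only checks that both loops give the same count; whether the chunked form is actually faster, and by how much, depends on the compiler, the optimization flags and the number of bits, so it would need to be timed on the real bitset_large type.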