Cache sizes #439

Closed
mborland opened this issue Sep 28, 2020 · 6 comments

Comments

@mborland
Member

In the discussion on the prime sieve PR (#400) we have repeatedly run into the problem of having no cross-platform way to query the size of the L1 cache. This repo should solve that issue. Is there an existing Boost library that this should go into as a PR? I think it would be useful outside of math, but it is not clear to me where it should go.

@NAThompson
Collaborator

I suspect, but do not know, that your syscalls will compile down to a cpuid asm instruction. This is one of the slowest instructions on a modern CPU, so it might not pay off to determine the cache size.

Could you godbolt + benchmark your cachesize code?

@mborland
Member Author

sysconf definitely compiles down to cpuid. I added the benchmark (largely copied from constants_performance), and it takes 143ns to run on my machine. My thought was to have a default L1D value of 32768 bytes and to expose an interface for changing that value.
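A minimal sketch of the "default plus override" idea described above, not the code from the referenced commit. The namespace `cache_sketch` and the functions `l1d_size` and `set_l1d_size` are hypothetical names chosen for illustration; only the 32768 default comes from the comment.

```cpp
#include <cstddef>

namespace cache_sketch {  // hypothetical namespace, not part of Boost

// Default L1 data cache size in bytes (32 KiB), as proposed in the comment.
constexpr std::size_t default_l1d_bytes = 32768;

// Function-local static holds the current setting and avoids
// static-initialization-order problems.
inline std::size_t& l1d_storage() {
    static std::size_t value = default_l1d_bytes;
    return value;
}

// Read the L1D size the library should assume.
inline std::size_t l1d_size() { return l1d_storage(); }

// Let the user override the default; a zero size is ignored
// because it cannot describe a usable cache.
inline void set_l1d_size(std::size_t bytes) {
    if (bytes != 0) {
        l1d_storage() = bytes;
    }
}

}  // namespace cache_sketch
```

A caller who knows the target hardware would call `cache_sketch::set_l1d_size(...)` once before running the sieve; everyone else gets the 32 KiB default without paying for a hardware query. This sketch is not thread-safe for concurrent writers.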

@jzmaddock
Collaborator

Presumably this would only be called once during program initialization anyway - in which case the cost would be negligible?

As for where it belongs.... I have no idea!
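The "query once during initialization" point can be sketched as follows. This is an illustration under stated assumptions: it uses `sysconf(_SC_LEVEL1_DCACHE_SIZE)`, which is a glibc extension and is only compiled in when that constant is available on Linux; everywhere else it falls back to 32768 bytes. The namespace `init_sketch` and both function names are hypothetical.

```cpp
#include <cstddef>
#if defined(__linux__)
#include <unistd.h>
#endif

namespace init_sketch {  // hypothetical namespace

constexpr std::size_t fallback_l1d_bytes = 32768;  // 32 KiB fallback

// Potentially slow query; falls back when the platform has no answer.
inline std::size_t query_l1d_bytes() {
#if defined(__linux__) && defined(_SC_LEVEL1_DCACHE_SIZE)
    const long reported = sysconf(_SC_LEVEL1_DCACHE_SIZE);
    if (reported > 0) {
        return static_cast<std::size_t>(reported);
    }
#endif
    return fallback_l1d_bytes;
}

// The function-local static is initialized exactly once (thread-safe
// since C++11), so the query cost is paid only on the first call.
inline std::size_t l1d_bytes() {
    static const std::size_t cached = query_l1d_bytes();
    return cached;
}

}  // namespace init_sketch
```

Every call after the first is a plain load of the cached value, so a one-time query cost on the order of the 143ns measured earlier in the thread would not show up in a hot loop.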

@NAThompson
Collaborator

NAThompson commented Sep 28, 2020

What about boost::predef?

I've always thought that the functionality of cpuid should be exposed somewhere in boost.

@mborland
Member Author

@jzmaddock With the user being responsible for memory allocation with the OI approach, this could just be an extra step in initialization. Querying the L1D size should be trivial in comparison to a cache miss/under-utilization.

@NAThompson I will open an issue with them to see what they say.

mborland added a commit to mborland/math that referenced this issue Oct 1, 2020: "Sets default L1D value but offers interface for user to change"

mborland closed this as completed Oct 1, 2020

@ckormanyos
Member

ckormanyos commented Oct 2, 2020

...a cpuid asm instruction. This is one of the slowest...

On the architecture being discussed, the cpuid instruction is one of several instructions with the special characteristic of being a barrier instruction, also known as a serializing instruction. In addition to returning the CPU-ID information in the specified registers, this instruction will block and clear the instruction pipeline. Another instruction often mentioned alongside it on that core is rdtsc, although rdtsc is not itself serializing and is usually paired with a fence (or with cpuid) when strict ordering is needed.

These instructions are great if you are writing a real-time operating system or similar code on the metal. They do, however, cost performance, because the instruction pipeline is cleared each time one executes.
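To make "returning the CPU-ID information in the specified registers" concrete, here is a small sketch that reads the vendor string from cpuid leaf 0. It assumes GCC or Clang on x86, where `<cpuid.h>` provides `__get_cpuid`; on any other compiler or architecture the function simply returns an empty string. The name `cpu_vendor_sketch` is hypothetical.

```cpp
#include <cstring>
#include <string>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define CPUID_SKETCH_AVAILABLE 1
#else
#define CPUID_SKETCH_AVAILABLE 0
#endif

// Returns the 12-character vendor string, or an empty string when
// cpuid is not available through <cpuid.h> on this platform.
inline std::string cpu_vendor_sketch() {
#if CPUID_SKETCH_AVAILABLE
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    // Leaf 0 places the vendor string in EBX, EDX, ECX (in that order).
    if (__get_cpuid(0, &eax, &ebx, &ecx, &edx)) {
        char vendor[12];
        std::memcpy(vendor + 0, &ebx, 4);
        std::memcpy(vendor + 4, &edx, 4);
        std::memcpy(vendor + 8, &ecx, 4);
        return std::string(vendor, 12);
    }
#endif
    return std::string();
}
```

Because each call executes a serializing instruction, a helper like this belongs in start-up code rather than in an inner loop.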


4 participants