Monotonic check of procfs fails in WSL2 #1092
That is interesting, as this only checks whether the timestamps are in the right order. Why would the procfs metric provider not report the timestamps in the right order? Could you please run only the procfs metric provider, using the command that is shown when you run the GMT? Could you also please send us the procfs metric provider output file in
Running the command stdbuf -o0 taskset -c 0 /home/david/green-metrics-tool/metric_providers/cpu/utilization/procfs/system/metric-provider-binary -i 99 > /tmp/green-metrics-tool/cpu_utilization_procfs_system.log gives me the following result (excerpt):
You can see that there is a jump into the past from 1741703676780350 to 1741703674264049. With WSL2 I don't want to (and can't) do energy measurements. But it can be nice to see some metrics when doing test runs. So for me it would be sufficient to be able to disable the monotonic check.
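To make the failure concrete, here is a minimal sketch of such an ordering check. It is not the GMT implementation; the function name `find_backward_jumps` is made up for illustration, and it assumes that only a strictly smaller timestamp counts as a violation. The two sample values are the ones quoted above (microseconds since the epoch), which differ by roughly 2.5 seconds.

```python
from typing import Iterable, List, Tuple


def find_backward_jumps(timestamps: Iterable[int]) -> List[Tuple[int, int, int]]:
    """Return (index, previous, current) for every sample older than its predecessor."""
    jumps = []
    previous = None
    for index, current in enumerate(timestamps):
        # Assumption: equal timestamps are tolerated, only backward steps are reported.
        if previous is not None and current < previous:
            jumps.append((index, previous, current))
        previous = current
    return jumps


# The two values quoted in this comment, in microseconds since the epoch.
samples = [1741703676780350, 1741703674264049]
for index, previous, current in find_backward_jumps(samples):
    print(f"sample {index}: clock went back by {previous - current} us")
```

Running this kind of scan over the provider log would show whether the jump is a one-off or happens repeatedly.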
So after getting the GMT installed on WSL 🤢 I could reproduce the error. The problem is that
@davidkopp can you please try the patch and see if that fixes your issue?
@ArneTR we should discuss if we want to change this everywhere [1], as the system time could change during long runs. Very unlikely but possible. Remember the leap second discussion :)
We could also just warn if the user runs a provider under Windows with a sampling interval of less than 500 ms. That should be enough to get monotonic times even under WSL. As you can't really use the data anyhow, that could be the easiest option. (Could be a simple check routine.)
[0] microsoft/WSL#77
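A check routine along those lines could look roughly like the sketch below. Everything here is hypothetical: the function names, the constant, and the WSL detection, which relies on the common heuristic that WSL kernel release strings contain "microsoft". The release strings in the usage note are illustrative, not taken from a real system.

```python
from typing import Optional

# Threshold suggested in this comment; the constant name is made up.
MIN_WSL_INTERVAL_MS = 500


def looks_like_wsl(kernel_release: str) -> bool:
    """Heuristic (assumption): WSL kernels carry 'microsoft' in their release string."""
    return "microsoft" in kernel_release.lower()


def interval_warning(kernel_release: str, interval_ms: int) -> Optional[str]:
    """Return a warning text if the interval is too short for WSL, else None."""
    if looks_like_wsl(kernel_release) and interval_ms < MIN_WSL_INTERVAL_MS:
        return (
            f"Sampling interval {interval_ms} ms is below {MIN_WSL_INTERVAL_MS} ms; "
            "timestamps may not be monotonic under WSL."
        )
    return None
```

In a real run the release string could come from `platform.release()`. For example, `interval_warning("5.15.0-microsoft-standard-WSL2", 99)` returns a warning text, while the same call with `"6.8.0-generic"` returns `None`.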
The patch fixes the issue for
The same issue also happens for the Cgroup providers, like
Luckily it is the clock source and not the writing to the file that makes the output non-monotonic 😅 Looking at the linked source under [0], it seems like it got way better in WSL2. Nonetheless, I guess it makes total sense to move to the monotonic clock. However, @ribalba, what I read is that gettimeofday is actually more performant than clock_monotonic. The best way would be to use the TSC, which however only works if it is invariant. I will make some checks and look further into this.
I have done some digging here.
Intermediate Summary
Performance considerations for applying Didi's Patch
No Patch
Performance counter stats for './metric-provider-binary -i 100' (10 runs):
Performance counter stats for './metric-provider-binary -i 10' (10 runs):
With Patch
Performance counter stats for './metric-provider-binary -i 100' (10 runs):
Performance counter stats for './metric-provider-binary -i 10' (10 runs):
Summary
I can barely see any differences between the two implementations. It would be more performant to do the re-keying in Python, but in this case I argue for not bringing in the complexity and keeping the re-keying in C.
@ribalba: Can you make a PR here changing this in all providers please?
Also: Please use CLOCK_MONOTONIC_RAW, as this has no NTP adjustments (according to ChatGPT; please double-check with the docs).
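For readers following along, here is a small Python sketch of what re-keying can mean. It is one reading of the idea, not the patch itself (the patch is in C and is not shown in this thread): sample the wall clock and the monotonic clock once at start-up, then derive every later epoch timestamp from the monotonic clock alone. The class name `RekeyedClock` and the fallback to `time.monotonic_ns()` on platforms without `CLOCK_MONOTONIC_RAW` are assumptions of this sketch.

```python
import time

# CLOCK_MONOTONIC_RAW is not available on every platform in Python's time module.
_RAW_CLOCK_ID = getattr(time, "CLOCK_MONOTONIC_RAW", None)


def monotonic_us() -> int:
    """Monotonic time in microseconds, from CLOCK_MONOTONIC_RAW where available."""
    if _RAW_CLOCK_ID is not None:
        return time.clock_gettime_ns(_RAW_CLOCK_ID) // 1000
    # Fallback (assumption of this sketch): Python's generic monotonic clock.
    return time.monotonic_ns() // 1000


class RekeyedClock:
    """Epoch-based timestamps that advance with the monotonic clock."""

    def __init__(self) -> None:
        # Read both clocks once; afterwards only the monotonic clock is consulted,
        # so later wall-clock adjustments cannot move timestamps backwards.
        self._offset_us = time.time_ns() // 1000 - monotonic_us()

    def now_us(self) -> int:
        return self._offset_us + monotonic_us()
```

The trade-off is that the reported time can drift away from the wall clock over long runs, since adjustments made after start-up are ignored.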
The check that the procfs metrics are monotonically increasing fails in my environment every time:
I guess I can blame WSL2 for that. For testing usage scenarios I'm using Windows 11 at the moment with WSL2. My CPU is AMD Ryzen 7 PRO 7840U.
There are 3 workarounds:
--dev-no-metrics
green-metrics-tool/metric_providers/base.py
Line 166 in 5504a9a
I'm wondering whether it makes sense to add another switch to runner.py (e.g. --skip-metric-checks) that allows disabling the metric checks. Not sure if it's worth it.
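A rough sketch of how such a switch might be wired up with argparse. This is purely illustrative: only the flag name `--skip-metric-checks` comes from the proposal above, while `build_parser`, `check_monotonic` and `validate` are hypothetical names and do not reflect how runner.py or base.py are actually structured.

```python
import argparse
from typing import Sequence


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Hypothetical runner options")
    parser.add_argument(
        "--skip-metric-checks",
        action="store_true",
        help="Do not validate provider output (e.g. the monotonic timestamp check)",
    )
    return parser


def check_monotonic(timestamps: Sequence[int]) -> None:
    """Raise if any timestamp is smaller than the one before it."""
    if any(b < a for a, b in zip(timestamps, timestamps[1:])):
        raise ValueError("timestamps are not monotonically increasing")


def validate(timestamps: Sequence[int], skip_metric_checks: bool) -> None:
    # With the switch set, provider output is accepted without any checks.
    if skip_metric_checks:
        return
    check_monotonic(timestamps)
```

With this shape, `build_parser().parse_args(["--skip-metric-checks"])` yields `skip_metric_checks=True`, and `validate` then returns without inspecting the data.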