runtime: should traceTickDiv be different for different architectures? #10554
Comments
Yes, that can be a concern, because the ordering imposed by timestamps must be consistent with the causal ordering of events. For example, if a goroutine is started at the same tick it was created, it can cause an inconsistent trace.
According to PowerISA v2.07 section 6.2: The suggested frequency at which the time base increments is 512 MHz,
I guess we can first try 64 on x86, 16 on ppc, and 16 for other platforms.
CL https://golang.org/cl/9247 mentions this issue.
My CL fixed the issue on POWER7/8, but the test still fails on PowerPC970.
For example, on PowerMac G5 (970FX), the time base frequency is only 100/3
$ cat /proc/cpuinfo | grep timebase
If you have access to POWER7/8 systems, could you please tell me what's the
winton-01(~) % grep timebase /proc/cpuinfo

On Sun, May 3, 2015 at 6:02 AM, Minux Ma [email protected] wrote:
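The `grep timebase /proc/cpuinfo` check above could also be done programmatically. The following is a minimal sketch, not anything from the CLs in this thread: `parseTimebase` is a hypothetical helper, and both the `key : value` layout and the sample value (512000000, matching the 512 MHz suggested by the PowerISA quote) are assumptions about what a ppc Linux kernel prints.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseTimebase scans /proc/cpuinfo-style text and returns the value of the
// first "timebase" field. The "key : value" layout is an assumption about
// how ppc Linux kernels format this file; parseTimebase is a hypothetical
// helper, not part of the Go runtime.
func parseTimebase(cpuinfo string) (uint64, error) {
	sc := bufio.NewScanner(strings.NewReader(cpuinfo))
	for sc.Scan() {
		key, val, ok := strings.Cut(sc.Text(), ":")
		if !ok || strings.TrimSpace(key) != "timebase" {
			continue
		}
		return strconv.ParseUint(strings.TrimSpace(val), 10, 64)
	}
	return 0, fmt.Errorf("no timebase field found")
}

func main() {
	// Illustrative sample input, not output captured from a real machine.
	sample := "processor\t: 0\ntimebase\t: 512000000\n"
	hz, err := parseTimebase(sample)
	fmt.Println(hz, err)
}
```

On a real system the input would come from reading /proc/cpuinfo, which the runtime itself may not be able to do cheaply at startup; that constraint is part of why auxv comes up in the replies.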
Thanks Dave. That explains the problem. Perhaps we do need dynamic
Can we get that from auxv?

On Sun, May 3, 2015 at 6:06 AM, Minux Ma [email protected] wrote:
(GitHub seems to have ignored my last email reply.)
It seems the glibc function __ppc_get_timebase() parses the timebase frequency
Here is the gdb dump of auxv (kernel 3.16.0)
After extensive stress testing on a PowerMac G5
(2.7GHz PPC970FX), traceTickDiv = 4 is the
maximum value that reliably passes all the pprof
tests.
One option is to use 16 for ppc64le (assuming that
all POWER CPUs capable of running in little-endian
mode have a 512MHz time base), and 4 for
ppc64.
This will make the trace slightly larger for ppc64, but
it will make the tests pass on more systems and doesn't
require making traceTickDiv a variable.
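The per-architecture proposal above can be sketched as follows. This is only an illustration: `tickDivFor` is a hypothetical function, and the values come from this thread (64 on x86, 16 for ppc64le, 4 for ppc64, 16 for other platforms). To keep traceTickDiv a constant, the runtime would presumably select the value at compile time per architecture rather than with a switch at run time; that is an assumption, not something the thread confirms.

```go
package main

import (
	"fmt"
	"runtime"
)

// tickDivFor returns the divisor proposed in this thread for a given GOARCH.
// It is a hypothetical helper for illustration only; the values are the ones
// suggested in the comments, not necessarily what was finally committed.
func tickDivFor(goarch string) int {
	switch goarch {
	case "386", "amd64":
		return 64 // cputicks advances once per CPU clock cycle
	case "ppc64":
		return 4 // largest value reported to pass reliably on a PowerMac G5
	default:
		return 16 // ppc64le (assumed 512MHz time base) and other platforms
	}
}

func main() {
	fmt.Println(runtime.GOARCH, tickDivFor(runtime.GOARCH))
}
```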
CL https://golang.org/cl/12579 mentions this issue.
Nearly all the flaky failures we've seen in trace tests have been
due to the use of time stamps to determine relative event ordering.
This is tricky for many reasons, including:
- different cores might not have exactly synchronized clocks
- VMs are worse than real hardware
- non-x86 chips have different timer resolution than x86 chips
- on fast systems two events can end up with the same time stamp

Stop trying to make time reliable. It's clearly not going to be for Go 1.5.
Instead, record an explicit event sequence number for ordering.
Using our own counter solves all of the above problems.

The trace still contains time stamps, of course. The sequence number
is just used for ordering.

Should alleviate #10554 somewhat. Then tickDiv can be chosen to
be a useful time unit instead of having to be exact for ordering.

Separating ordering and time stamps lets the trace parser diagnose
systems where the time stamp order and actual order do not match
for one reason or another. This CL adds that check to the end of
trace.Parse, after all other sequence order-based checking.
If that error is found, we skip the test instead of failing it.
Putting the check in trace.Parse means that cmd/trace will pick
up the same check, refusing to display a trace where the time stamps
do not match actual ordering.

Using net/http's BenchmarkClientServerParallel4 on various CPU counts,
not tracing vs tracing:

name                      old time/op  new time/op  delta
ClientServerParallel4     50.4µs ± 4%  80.2µs ± 4%  +59.06%  (p=0.000 n=10+10)
ClientServerParallel4-2   33.1µs ± 7%  57.8µs ± 5%  +74.53%  (p=0.000 n=10+10)
ClientServerParallel4-4   18.5µs ± 4%  32.6µs ± 3%  +75.77%  (p=0.000 n=10+10)
ClientServerParallel4-6   12.9µs ± 5%  24.4µs ± 2%  +89.33%  (p=0.000 n=10+10)
ClientServerParallel4-8   11.4µs ± 6%  21.0µs ± 3%  +83.40%  (p=0.000 n=10+10)
ClientServerParallel4-12  14.4µs ± 4%  23.8µs ± 4%  +65.67%  (p=0.000 n=10+10)

Fixes #10512.

Change-Id: I173eecf8191e86feefd728a5aad25bf1bc094b12
Reviewed-on: https://go-review.googlesource.com/12579
Reviewed-by: Austin Clements <[email protected]>
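The idea in the commit message above, ordering by an explicit sequence number and then checking that time stamps agree with that order, can be sketched roughly as below. This is a simplified illustration and not the code from the CL: the `event` type, the `order` function, and `errTimeOrder` are hypothetical names, and the real trace format and the real check in trace.Parse are more involved.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// event is a simplified, hypothetical trace record: seq is the explicit
// sequence number used for ordering, ts is the time stamp in trace ticks.
type event struct {
	seq uint64
	ts  int64
}

// errTimeOrder is a hypothetical error reported when time stamps disagree
// with the order given by sequence numbers.
var errTimeOrder = errors.New("time stamps out of order")

// order sorts events by sequence number only, then reports errTimeOrder if
// any time stamp goes backwards along that order. Equal time stamps are
// accepted, because the sequence number already decides the order.
func order(evs []event) error {
	sort.Slice(evs, func(i, j int) bool { return evs[i].seq < evs[j].seq })
	for i := 1; i < len(evs); i++ {
		if evs[i].ts < evs[i-1].ts {
			return errTimeOrder
		}
	}
	return nil
}

func main() {
	// Two events share a time stamp; the sequence number breaks the tie.
	same := []event{{seq: 2, ts: 10}, {seq: 1, ts: 10}, {seq: 3, ts: 11}}
	fmt.Println(order(same), same)

	// The clock ran backwards relative to the sequence order.
	skewed := []event{{seq: 1, ts: 20}, {seq: 2, ts: 15}}
	fmt.Println(order(skewed))
}
```

With this separation a coarse tick (a large tickDiv or a slow time base) can no longer produce a wrong order; at worst it produces equal time stamps, and a caller such as a test could choose to skip rather than fail when the time stamp check reports an error.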
It's documented that "Timestamps in trace are cputicks/traceTickDiv." and
that "64 is somewhat arbitrary (one tick is ~20ns on a 3GHz machine)."
That is OK for x86 CPUs, where cputicks increments by 1 each CPU clock cycle,
but on ppc64x, cputicks might be increasing only at bus frequency.
(For example, on PowerPC G5, it's increasing at about 300MHz.) So if we're
still using traceTickDiv=64 there, each trace tick is not ~20ns, but ~200ns.
Will that be a concern?
/cc @dvyukov
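For reference, the arithmetic behind the ~20ns and ~200ns figures is simply traceTickDiv divided by the counter frequency. A minimal sketch, where `nsPerTraceTick` is a hypothetical helper and the frequencies are the round numbers quoted in this issue rather than measurements:

```go
package main

import "fmt"

// nsPerTraceTick returns the duration of one trace tick in nanoseconds for a
// counter that increments freqHz times per second and a given traceTickDiv.
// It is a hypothetical helper used only to illustrate the arithmetic.
func nsPerTraceTick(freqHz float64, tickDiv int) float64 {
	return float64(tickDiv) / freqHz * 1e9
}

func main() {
	fmt.Printf("3GHz, div 64:   %.1f ns\n", nsPerTraceTick(3e9, 64))
	fmt.Printf("300MHz, div 64: %.1f ns\n", nsPerTraceTick(300e6, 64))
	fmt.Printf("512MHz, div 16: %.1f ns\n", nsPerTraceTick(512e6, 16))
}
```

The last line uses the 512MHz time base and the divisor of 16 discussed in the comments, which brings the tick length back to the same order of magnitude as on x86.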