Suspected live-lock issue in lxd.db under contention #2398
I was just looking up in the documentation for SQLite3, mostly for my own benefit, and wanted to quote the requisite paragraph here:
First-pass benchmarking script:
# Required on my Vagrant box, for some reason
echo 'nameserver 8.8.8.8' | sudo tee -a /etc/resolv.conf
# Required because I'm a second class citizen (British…)
sudo locale-gen en_GB.UTF-8
# Stop nagging from apt-get
export DEBIAN_FRONTEND=noninteractive
sudo apt update --fix-missing
sudo apt install -y lxd zfsutils-linux parallel
# accept all defaults, unfortunately doesn't respect DEBIAN_FRONTEND variable
sudo lxd init
# Don't ask.......
echo "will cite" | parallel --bibtex
# Get Started
lxc remote add images images.linuxcontainers.org
for i in 1 5 10 100 1000; do
printf "## %d\n" "$i"
time seq 100 | parallel -j$i --eta lxc launch images:ubuntu/trusty/i386 ubuntu-32-{}
done
As mentioned in the initial issue report, I suspect the issue is parallelism and processes struggling with locks on the lxd.db database. Unfortunately I can't test this script for real, as it simply doesn't work on my Vagrant box; I suspect the problem lies with Vagrant, however, as the machine keeps quitting on me.
You could use lxd-benchmark, included in the main git repository, to test parallel startup time. Last time I did some testing and fixes around that, I could easily get to 8-10 containers a second on a dual quadcore server, so we either severely regressed from then or there's something else which differs between our test environments.
Hi @stgraber, thanks for checking in. I'll direct my team to test using lxd-benchmark.
Whilst waiting for my team so I can corroborate the issue, can you speak at all about the choice of extremely naïve locking of the sqlite3 DB versus a design with one goroutine synchronizing the database access?
No particular reason other than simplicity. In our tests the database has never been the bottleneck, so we never felt there was a reason to focus on changing such a critical and potentially fragile piece of the code.
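For illustration, a minimal sketch (in Go, with invented names and schema; this is not LXD's actual code) of the single-goroutine design being asked about: one goroutine owns the sqlite handle and every statement is funnelled to it over a channel, so callers inside the daemon never fight over the file lock directly.

// Minimal sketch of serialising database access through one goroutine.
// This is not LXD's code; names, schema and error handling are invented.
package main

import (
	"database/sql"
	"fmt"

	_ "github.com/mattn/go-sqlite3"
)

// dbRequest carries one statement plus a channel to return the result on.
type dbRequest struct {
	query string
	args  []interface{}
	reply chan error
}

// startDBWorker opens the database and returns the channel callers use.
// Only the worker goroutine ever touches the *sql.DB.
func startDBWorker(path string) (chan<- dbRequest, error) {
	db, err := sql.Open("sqlite3", path)
	if err != nil {
		return nil, err
	}
	reqs := make(chan dbRequest)
	go func() {
		defer db.Close()
		for req := range reqs {
			// Statements execute strictly one at a time here.
			_, execErr := db.Exec(req.query, req.args...)
			req.reply <- execErr
		}
	}()
	return reqs, nil
}

func main() {
	reqs, err := startDBWorker(":memory:")
	if err != nil {
		panic(err)
	}
	reply := make(chan error)
	reqs <- dbRequest{query: "CREATE TABLE t (id INTEGER)", reply: reply}
	fmt.Println("create table:", <-reply)
}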
Hi @stgraber, I'm working on the team Lee mentioned :) We did a benchmark on our server using lxd-benchmark:
Is this a configuration mistake on our end or might this be related to the locking issue?
That's with LXD 2.0.4 from Ubuntu 16.04 on a rather old server (dual Xeon E5430 with 12GB of RAM). Note that the choice of alpine and --privileged helps get cleaner benchmark results, as no filesystem remapping is done and the containers themselves don't do very much at all when they start compared to a full Ubuntu system.
Thanks for the reply @stgraber. I'm not sure what to make of it; you're telling us that the performance characteristics are different when LXD is used in a totally different way, one that doesn't suit our workload? I'd like to get back to the issue as reported: given our workload, LXD is at least a couple of orders of magnitude slower than other solutions (which we can't use...) even when the host system is unloaded, hence our suspicion of a (live-)locking problem. I will endeavour to benchmark this some more and come back to you, and we can decide how to proceed with my report. Thanks for your time so far!
What I'm saying is that if we suspect a database issue, it's best to remove other potential sources of slowdown from the benchmark, which is what I did above.
I'll run both benchmarks with some UDP (statsd) logging enabled, get the data plotted, and then we can compare results.
For comparison, that's the result of the same benchmark you ran:
So we see a speed decrease as the machine gets loaded (again, running this on CPUs that are approaching 10 years old) but it's still nowhere near as bad as what you're getting.
Ok, so I found one thing that's very weird. Performance is massively better with the lxd that's shipped by Ubuntu vs a hand-built one.
Ubuntu version:
Same thing but hand-built:
The main difference I can think of is that the packaged lxd is built using source dependencies from the Ubuntu archive rather than the latest version from upstream.
Confirmed that re-building with the same set of source dependencies as Ubuntu gives me the same speed too... now to figure out which of those dependencies is causing that big speed change. Based on your feedback, I'm going to start by testing sqlite since that'd explain what you're seeing :)
And sure enough, that's the one causing the speed difference.
So yeah, looks like something happened to go-sqlite3 that makes it massively less performant in the way we use it now... Ubuntu and Debian ship a very old version of mattn/go-sqlite3 which isn't affected.
Thanks for looking into that for us @stgraber - I tried to find out which refs of go-sqlite3 the Ubuntu packages actually use. Since we're talking about cgo, I wonder if the general downturn in compiler speed after Go 1.5 is responsible?
Hi @stgraber, I finally had time to look into the way Ubuntu packages things. It's not a domain I'm familiar with, but here's what I found out; I'd be glad if you could validate my assumptions for me:
So, that gives us a solid timeline; to confirm, then:
So, what changed in go-sqlite3 between those versions?
Immediately, nothing jumps out at me in the compare view as being horrible, but if you can confirm that I'm at least barking up the right tree, I don't mind bisecting the builds. I haven't factored in the different Go versions yet, but I think we can rule those out, unless the builds of LXD for Xenial pre-date Feb '16. It shouldn't be too hard to bisect the 11 commits between the tagged versions, or there's a chance they've already fixed it on their end.
Let me try to clarify a bit more where the problem is. If I build LXD master with Golang 1.7 using Ubuntu's golang-github-mattn-go-sqlite3-dev at version v1.1.0, I'm getting the good performance I mentioned above; that's the test build I did yesterday. Xenial and Yakkety both ship 1.1.0:
Now replace that packaged version of golang-github-mattn-go-sqlite3-dev with what they've got in git master and you get the terrible performance result. Because of the way we build things, this means that:
I suspect it'd be useful to confirm that using upstream's v1.1.0 branch fixes things. If it does, we can simply replace our import with "https://gopkg.in/mattn/go-sqlite3.v1" which would pull that particular branch instead of master.
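To make that concrete, the swap would look roughly like the sketch below; the surrounding file is illustrative rather than an actual LXD source file, and note that the Go import path itself is written without the https:// prefix.

// Illustrative only: a stand-in for the LXD source file that imports the
// sqlite driver, showing the pin to the v1 branch via gopkg.in.
package db

import (
	"database/sql"

	// Before (tracks mattn/go-sqlite3 master):
	//   _ "github.com/mattn/go-sqlite3"
	// After (pins the v1 branch through gopkg.in):
	_ "gopkg.in/mattn/go-sqlite3.v1"
)

// openDB opens a sqlite database; the registered driver name is "sqlite3"
// with either import path, so nothing else needs to change.
func openDB(path string) (*sql.DB, error) {
	return sql.Open("sqlite3", path)
}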
Thanks for the clarification, that helps a lot. I hope my "archaeology" was at least borderline useful! I'll check with my team where we're installing LXD from; my understanding is we're literally running:
Hmm, so building using go-sqlite3.v1 doesn't help...
I'm not on my work machine right now unfortunately, so I don't have a build environment to hand, but could you humour me and try introducing some randomness (rand.Intn) into the retry sleep?
Sure, I'll do that now. I also tracked down the performance difference between go-sqlite3.v1 and the Ubuntu package: it's because Ubuntu links against the system libsqlite3 instead of using a bundled copy.
LXD built with "go install -v -x github.com/lxc/lxd/lxd" => 0.63 containers/s
That's quite the difference! Excellent detective work, and really, heartfelt kudos for being so engaged with us here, I sincerely appreciate it. Can I somehow help you in identifying which version of the library is at fault?
mattn/go-sqlite3#330 is about the -tags libsqlite3 trick not actually working so well...
He's apparently using 3.14.0, Ubuntu is on 3.14.1, but I'd be surprised if the problem were so trivial... I instead suspect it's got to do with the way cgo works when the code is embedded vs used through a shared library. Anyway, I'll refresh his embedded copy to 3.14.1 and see if that fixes things somehow.
Hmm, actually, Ubuntu has 3.14.1 now but Xenial had 3.11, so it's almost certainly not the problem since Ubuntu's 3.11 was fast too.
Confirmed that updating with a copy of 3.14.1 doesn't fix the performance problem.
As for the rand.Intn idea, I applied http://paste.ubuntu.com/23212369/ and I'm getting 0.79 containers/s, so nowhere near the kind of performance boost you get from using the system libsqlite3.so.
Thanks @stgraber, I really appreciate you shooting in the dark with my bonkers idea there :) I didn't even know that Go could bundle shared libraries to use as part of the binary... I'm trying to read up on cgo a bit more to improve my understanding in this field. Thanks for opening the issue at mattn/go-sqlite3#330, I've subscribed.
Is it as simple as static vs. dynamic linking?
It could be, yes. I'm still not sure why it'd matter though... All it should do is add a few kB to the binary for that extra section, rather than load it from somewhere else at startup...
Talking to someone on my team today, they theorised about what might be going on, but unfortunately I know virtually nothing about cgo and how it deals with linking, or with bundling dynamic libraries to make portable static binaries.
So I did a bit more digging and wanted to leave my observations here, for you and/or any of my team who're checking in in the morning before I do:
This is what their docs say (emphasis mine):
So, I also noted the output of running the following:
It gives me the result I expected; I can confirm your results about the -tags libsqlite3 flag.
The flag doesn't seem to have the intended effect, which is strange, because it looks like it's setting the appropriate build flags.
Curiously, you mentioned something about removing the bundled version, but I wasn't able to find out what you were referring to.
Go to $GOPATH/src/github.com/mattn/go-sqlite3 and remove sqlite3-binding.c and sqlite3-binding.h, then run the go install again and LXD will be built using the shared library.
Thanks, I thought you'd modified the binary somehow; I was wondering exactly which kind of wizard you were. So it seems like it's an interplay between linker flags and cflags not doing the right thing?
Yeah, my current assumption is that cgo builds all .c files, then the linker just picked up the local symbols over the system ones, which makes sense. I suspect that just moving the .c and .h files into a sub-directory may fix this issue but I haven't had time to test yet.
Interestingly, and as a final point, our horrible performance is on 16.04 with 2.0.4, with a build that's using the shared system library.
Excellent intuition, I'd not have guessed at that, but I can imagine that!
And you can confirm that "ldd /usr/bin/lxd" on your test systems shows that it's linked with the system's libsqlite3? If so, then this is very weird as I certainly cannot reproduce the issue on any of my systems unless I use a build of lxd that's not linked against it.
Confirmed, barring user error. Process tree shows:
Hmm, that's a bit puzzling then, I wonder what else would be causing the exact same kind of performance degradation for you if it's not that issue... Also weird that I can't reproduce it while using the same LXD version, kernel and filesystem driver.
Tomorrow we'll write a short script to start lxd with the -cpuprofile flag. I was glad to see Go's profiling tools baked in already when I forked lxd; I hope this shows us where we're spending the most time.
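For anyone reproducing this: a -cpuprofile style flag is backed by Go's standard runtime/pprof machinery. A standalone sketch of that mechanism (not LXD's own flag wiring) looks like this:

// Minimal sketch of CPU profiling with Go's standard runtime/pprof,
// the same machinery behind a -cpuprofile style flag. This is not LXD's
// own wiring, just an illustration.
package main

import (
	"flag"
	"log"
	"os"
	"runtime/pprof"
)

func main() {
	cpuprofile := flag.String("cpuprofile", "", "write CPU profile to this file")
	flag.Parse()

	if *cpuprofile != "" {
		f, err := os.Create(*cpuprofile)
		if err != nil {
			log.Fatal(err)
		}
		defer f.Close()
		if err := pprof.StartCPUProfile(f); err != nil {
			log.Fatal(err)
		}
		defer pprof.StopCPUProfile()
	}

	// Do the work you want profiled here, then inspect the result with:
	//   go tool pprof <binary> <profile>
	busyWork()
}

func busyWork() {
	x := 0
	for i := 0; i < 1e8; i++ {
		x += i
	}
	_ = x
}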
Yeah, I always restart the lxd daemon before doing a benchmark run. I've not noticed a significant slowdown over time, but it's always best to have a clean, identical setup when doing benchmarks :)
CPU profiling leads to very interesting results; our binary, CPU profile and test script are attached to this comment. I will now close this issue and open a fresh one... the bottom line seems to be that we lose most of our time in one particular call. That leads me to suspect the kernel version or configuration, as it deals solely with procfs... but I'd appreciate a second opinion on our methodology. Our test script:
Output of LXD bench for this run:
Result of the call:
Our exact binary version, and the CPU profile:
Is that CPU time or wall time? Shift owner touches every file in the container, so I'd expect it to eat a lot of wall time, but not necessarily a lot of CPU time.
Honestly, I don't know; I assume it's CPU time, since this is a CPU profile. Just thinking out loud, but thanks for the hint to go look more closely at that.
We have another datapoint, which is the effect of upgrading:
Ok. It looks like that function is switching to C, so it's possible that the context switch from Go to cgo/C is expensive, although I wouldn't necessarily expect it. The C function doesn't look like it really has to be written in C, though, so it might be worth a science experiment to see if doing it in Go is any faster. Anyway, thanks for looking into it!
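Purely as an illustration of that science experiment, here is a rough pure-Go tree walk that remaps ownership by a fixed offset. It is a guess at the shape of such a helper, not LXD's real shifting code (which also has to deal with ACLs, capabilities and more), and the path and offset below are made up.

// Very rough sketch of shifting ownership in pure Go, as a stand-in for
// the C helper discussed above. NOT LXD's real logic; it only remaps
// uid/gid by a fixed offset while walking the tree (Linux, run as root).
package main

import (
	"log"
	"os"
	"path/filepath"
	"syscall"
)

func shiftOwner(root string, offset int) error {
	return filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
		if err != nil {
			return err
		}
		st, ok := info.Sys().(*syscall.Stat_t)
		if !ok {
			return nil // unexpected platform or stat type
		}
		// Lchown so symlinks themselves are shifted, not their targets.
		return os.Lchown(path, int(st.Uid)+offset, int(st.Gid)+offset)
	})
}

func main() {
	// Hypothetical rootfs path and offset, for illustration only.
	if err := shiftOwner("/var/lib/lxd/containers/example/rootfs", 100000); err != nil {
		log.Fatal(err)
	}
}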
I think it must be CPU time; that's what the graph shows. However:
That's a pretty huge time difference, so either we lose a massive amount of time in that call or something else is going on. Unfortunately I don't think the CPU profiling will help us here, as verified by our own testing.
Required information
Issue description
We run an environment where the lxc command line tool is used to run short-lived workloads in containers. The lxc commands are dispatched into the background with a fork/exec/detach process (very typical) from our worker processes. The behaviour we observe is a progressive slowing-down of the system in spite of negligible host load. With three parallel processes we have a throughput of ~3.8 containers per minute; with more, that drops significantly.
Initial research suggests that this is related to live-lock contention on lxd.db, as processes have a fixed 100x100ms retry behaviour which causes all the processes to line up and bang on the lock simultaneously. First indications are that introducing a little entropy into the loop sleep times increases throughput dramatically, such as adding a rand(100,i) ms sleep to each loop iteration.
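A hedged sketch of that mitigation is below; it is not LXD's actual retry loop, and the error value, attempt count and base delay are simply modelled on the 100x100ms description above.

// Sketch of adding jitter to a fixed retry sleep so contending processes
// stop waking up in lock-step. Not LXD's actual retry loop.
package main

import (
	"errors"
	"fmt"
	"math/rand"
	"time"
)

var errLocked = errors.New("database is locked")

// withRetry retries op up to 100 times, sleeping ~100ms plus random
// jitter between attempts instead of exactly 100ms every time.
func withRetry(op func() error) error {
	var err error
	for attempt := 0; attempt < 100; attempt++ {
		if err = op(); err == nil || !errors.Is(err, errLocked) {
			return err
		}
		jitter := time.Duration(rand.Intn(100+attempt)) * time.Millisecond
		time.Sleep(100*time.Millisecond + jitter)
	}
	return err
}

func main() {
	calls := 0
	err := withRetry(func() error {
		calls++
		if calls < 3 {
			return errLocked
		}
		return nil
	})
	fmt.Println("succeeded after", calls, "calls, err =", err)
}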
Steps to reproduce
for i in $(seq 1 100); do lxc exec ubuntu-32 sleep 500 & done
The lxc processes hang. Output of lsof /var/lib/lxd/lxd.db shows ~3+n entries (where n is roughly the number of times we've dispatched an lxc exec command).
I'm not quite sure how best to time the issue; I thought about the following:
1s
, so is unsuitable.Before attempting a PR I'd like to know what, if any discussion has been had around the internal database, it's design choices, the implementation (~3+n × the same file open) and the naïve choice of sleep/wait with the locking of the database. Before potentially pushing an unwelcome change to the internals.