On Mon, Dec 22, 2014 at 10:45:33AM +0100, Martin Kletzander wrote:
On Wed, Dec 17, 2014 at 04:59:30PM -0700, Eric Blake wrote:On 12/17/2014 08:06 AM, Martin Kletzander wrote:On Wed, Dec 17, 2014 at 12:00:36AM -0700, Eric Blake wrote:On 12/16/2014 11:51 PM, Eric Blake wrote:On 12/15/2014 12:58 AM, Martin Kletzander wrote:Instead of setting the value of cpuset.mems once when the domain starts and then re-calculating the value every time we need to change the child cgroup values, leave the cgroup alone and rather set the child data every time there is new cgroup created. We don't leave any task in the parent group anyway. This will ease both current and future code. Signed-off-by: Martin Kletzander <mkletzan@xxxxxxxxxx> --- src/qemu/qemu_cgroup.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++++-- src/qemu/qemu_driver.c | 59 +++++++++++++++----------------------------- 2 files changed, 85 insertions(+), 41 deletions(-)This patch causes libvirtd to segfault on startup:More particularly, I'm doing an upgrade from older libvirtd to this version, while leaving a transient domain running that was started by the older libvirtd. Hope that helps you narrow in on the problem.Weird - At the time I made the report, I ran 'git bisect' and reliably reproduced the crash across multiple libvirtd restarts (a restart for each new build while trying to nail down the culprit commit), as long as I left that transient domain running.
I managed to reproduce this (or actually talk to a person who did) and found out a bit more. As you say, until the domain was running everything was reproducible. However, when I started it with the old libvirt second time to find out the root cause, it just worked...
I tried that and it works for me. And I tried various domains, both heavily cgroup dependent and simple ones.But now that I've rebooted, and then proceeded to do incremental builds from both before and after the patch, I can't reproduce the crash. Although my formula for creating my transient domain was the same both yesterday and today, there may be some difference in the version of libvirtd that was running at the time I created the domain that then affected the XML affecting the libvirtd restarts.Reverting 86759e and af2a1f0 was sufficient to get me going again locally, but I'm not going to push my reversions until you've first had a chance to address the regression.#2 0x00007ffff7405673 in virCgroupHasEmptyTasks (cgroup=0x0, controller=2) at util/vircgroup.c:3935From this line it looks like priv->cgroup was not initialized. I did not add a check for that, so that may be the cause. I'll send a patch soon. But I wonder how did you manage to do that, is that a session libvirtd you restarted? Otherwise how come virCgroupNewDetectMachine() didn't fill the cgroup?I'm not spotting anything obvious why it wasn't initialized at the time I reproduced the crash, nor why I can't reproduce it now. Hopefully it's not a lurking time bomb.
It might be and it may cause more problems that just non-working hot-plugging of vCPUs. And it is not caused by these patches I added. The thing is that old libvirt starts a domain, but it *somehow* (this is the part I was not able to find out yet) doesn't add the task into the cgroup hierarchy for all cgroup controllers. The /proc/$(pidof qemu-kvm)/cgroups file looks like this then: 10:hugetlb:/ 9:perf_event:/machine.slice/machine-qemu\x2da.scope 8:blkio:/ 7:net_cls:/machine.slice/machine-qemu\x2da.scope 6:freezer:/machine.slice/machine-qemu\x2da.scope 5:devices:/ 4:memory:/ 3:cpuacct,cpu:/ 2:cpuset:/machine.slice/machine-qemu\x2da.scope/emulator 1:name=systemd:/machine.slice/machine-qemu\x2da.scope And when new libvirtd is starting, it detects the cgroup mounts and placements (in this example the placement for memory cgroup is "/") and the it goes and validates them. If the cgroup doesn't have a placement that ends with machine-qemu-blah or libvirt-qemu or whatever, the following condition in virCgroupValidateMachineGroup() fails: if (STRNEQ(tmp, name) && STRNEQ(tmp, partname) && STRNEQ(tmp, scopename)) [in here the variable tmp points after the last slash in the partition string] The function returns false, hence virCgroupNewDetectMachine() frees the cgroup (and sets it to NULL) and returns success. This means we will still properly detect the machine but we won't have access to any cgroup settings of that particular machine. The patch I posted fixes it to the extent that the daemon doesn't crash, but it doesn't fix the root cause (which is unknown). I played with it a bit more and I've found out that a domain, which had its own partition in all the cgroups controllers when started, *lost* some of them after a while. I might have been caused by libvirtd restarting, but that's highly unlikely, I guess. Even though inotify works for cgroup fs it does not show us who accessed it. I'll have to resort to using systemtap again.
If it happens again, I'll definitely report it; but for now, without a reliable reproduction, it could easily have been something caused on my end (since it is my dev machine, it may have been caused by a half-baked patch on my end that I was testing at the time I created the transient domain, where using only upstream patches wouldn't see the issue). So for now, don't worry too hard if you can't find it either.There is a valid code path that allows the cgroup to be NULL, although not easy to reach. However, since it is valid, we added a patch that just skips the cgroup restoration for cgroup == NULL. If you happen to stumble upon a similar problem, I'd be happy to try dealing with it. Martin
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list
Attachment:
signature.asc
Description: Digital signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list