On Sat, Dec 24, 2016 at 05:14:44PM +0100, Guido Günther wrote:
> Hi Cedric,
> On Wed, Dec 21, 2016 at 02:36:39PM +0100, Cedric Bosdonnat wrote:
> > Hey Christian,
> >
> > On Tue, 2016-12-20 at 12:29 +0100, Christian Ehrhardt wrote:
> > > Hi,
> > > I found an issue in libvirt related to libvirt-lxc, but failed to find the root cause.
> > >
> > > The TL;DR is: libvirt-lxc guests get killed on libvirt restart due to "internal error: No valid cgroup for machine"
> > >
> > > I was able to reproduce it with libvirt 1.3.1, 2.4 and 2.5 as packaged in Ubuntu and Debian.
> > > I wanted to ask for two things:
> > > - wider coverage where this does reproduce
> >
> > I couldn't reproduce here with openSUSE Tumbleweed and libvirt 2.5 packages.
>
> I had a short look and it seems like this sequence is killing all running
> libvirt-lxc guests reliably:
>
>     # no lxc guest running yet
>     export LIBVIRT_DEFAULT_URI=lxc:///
>     DOMAIN=sl
>     systemctl daemon-reload
>
>     # start lxc guest
>     virsh start ${DOMAIN}
>     sleep 1 # give vm some time to start
>     systemctl restart libvirtd

Using ftrace I can see that systemd moves the process into the wrong cgroup on start:

systemd-1 [000] .... 652.333068: cgroup_attach_task: dst_root=3 dst_id=80 dst_level=2 dst_path=/system.slice/libvirtd.service pid=4073 comm=libvirt_lxc
systemd-1 [000] .... 652.333117: cgroup_attach_task: dst_root=3 dst_id=80 dst_level=2 dst_path=/system.slice/libvirtd.service pid=4073 comm=libvirt_lxc
systemd-1 [000] .... 652.333160: cgroup_attach_task: dst_root=6 dst_id=80 dst_level=2 dst_path=/system.slice/libvirtd.service pid=4073 comm=libvirt_lxc
systemd-1 [000] .... 652.333203: cgroup_attach_task: dst_root=4 dst_id=107 dst_level=2 dst_path=/system.slice/libvirtd.service pid=4073 comm=libvirt_lxc
systemd-1 [000] .... 652.333245: cgroup_attach_task: dst_root=8 dst_id=80 dst_level=2 dst_path=/system.slice/libvirtd.service pid=4073 comm=libvirt_lxc
systemd-1 [000] .... 652.333286: cgroup_attach_task: dst_root=7 dst_id=84 dst_level=2 dst_path=/system.slice/libvirtd.service pid=4073 comm=libvirt_lxc

I've attached the script to reproduce this and would be glad to hear any ideas
about the root cause.

Cheers,
 -- Guido
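The cgroup_attach_task events quoted above can be narrowed down to a single PID with a small filter. This is only a sketch: the function name attach_events_for_pid is made up for illustration, and it assumes the trace text has the same "key=value" field layout as the lines in this mail.

```shell
# Hypothetical helper (not part of libvirt or systemd): print
# "dst_root dst_path" for every cgroup_attach_task event of one PID.
# Reads ftrace text from stdin, e.g. /sys/kernel/debug/tracing/trace.
attach_events_for_pid() {
    awk -v pid="$1" '
        /cgroup_attach_task:/ {
            root = ""; path = ""; hit = 0
            for (i = 1; i <= NF; i++) {
                if ($i == "pid=" pid)        hit = 1
                else if ($i ~ /^dst_root=/)  root = substr($i, 10)
                else if ($i ~ /^dst_path=/)  path = substr($i, 10)
            }
            # Only report events that belong to the requested PID
            if (hit) print root, path
        }'
}

# Usage sketch (needs root and an enabled cgroup trace event):
#   attach_events_for_pid 4073 < /sys/kernel/debug/tracing/trace
```

Matching on the exact "pid=<n>" field avoids picking up other PIDs that merely share a prefix with the one of interest.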
#!/bin/bash

set -e

export LIBVIRT_DEFAULT_URI=lxc:///
DOMAIN=sl

function cleanup ()
{
    set +x
    echo "Running cleanup"
    # Switch the cgroup trace events off again
    echo 0 > /sys/kernel/debug/tracing/events/cgroup/enable
    virsh -c lxc:/// destroy sl || true
    if [ -n "$SUCCESS" ]; then
        echo "Finished successfully"
    else
        echo "Got an error."
    fi
}

trap cleanup EXIT

cat <<EOF >dom.xml
<domain type='lxc'>
  <name>sl</name>
  <memory unit='KiB'>256000</memory>
  <currentMemory unit='KiB'>256000</currentMemory>
  <vcpu placement='static'>1</vcpu>
  <os>
    <type>exe</type>
    <init>/bin/bash</init>
  </os>
  <features>
    <privnet/>
  </features>
  <clock offset='utc'/>
  <devices>
    <filesystem type='mount' accessmode='passthrough'>
      <source dir='/'/>
      <target dir='/'/>
    </filesystem>
    <console type='pty'>
      <target type='lxc' port='0'/>
    </console>
  </devices>
</domain>
EOF

virsh define dom.xml || true

# Enable tracing of cgroup events
echo 1 > /sys/kernel/debug/tracing/events/cgroup/enable

# Reload systemd, this triggers the problem
echo "systemctl daemon-reload start" > /sys/kernel/debug/tracing/trace_marker
systemctl daemon-reload
echo "systemctl daemon-reload finished" > /sys/kernel/debug/tracing/trace_marker

set -x

# Start the lxc container
echo "virsh start ${DOMAIN} start" > /sys/kernel/debug/tracing/trace_marker
virsh start ${DOMAIN}
echo "virsh start ${DOMAIN} finished" > /sys/kernel/debug/tracing/trace_marker
virsh list

# Remember the cgroup file of the running container
PID=$(virsh -c lxc:/// list --state-running | sed -ne 's/ \([0-9]\+\) .*/\1/p')
WATCH=/proc/$PID/cgroup
echo "Before ${WATCH}"
cat ${WATCH}

sleep 1

# Restart libvirtd
echo "systemctl stop libvirtd start" > /sys/kernel/debug/tracing/trace_marker
systemctl stop libvirtd
echo "systemctl stop libvirtd finished" > /sys/kernel/debug/tracing/trace_marker

echo "systemctl start libvirtd start" > /sys/kernel/debug/tracing/trace_marker
systemctl start libvirtd
echo "systemctl start libvirtd finished" > /sys/kernel/debug/tracing/trace_marker

# Check if container is still there
echo "After"
cat ${WATCH}
if ! virsh list | grep -qs "${DOMAIN}[[:space:]]\+running"; then
    echo 'Domain disappeared!'
    exit 1
fi
SUCCESS=1
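
The script prints /proc/$PID/cgroup before and after the libvirtd restart. A rough way to see at a glance which controllers ended up under libvirtd.service is sketched below. The function name controllers_in_libvirtd_service is hypothetical, and the sketch only assumes the standard "hierarchy-ID:controllers:path" line format of /proc/<pid>/cgroup.

```shell
# Hypothetical helper: list the controllers whose cgroup path lies in
# (or below) libvirtd.service. Reads /proc/<pid>/cgroup lines from stdin.
controllers_in_libvirtd_service() {
    while IFS=: read -r id controllers path; do
        case "$path" in
            */libvirtd.service|*/libvirtd.service/*)
                # An empty controller list denotes the unified (v2) hierarchy
                echo "${controllers:-unified}"
                ;;
        esac
    done
}

# Usage sketch, with PID set as in the script above:
#   controllers_in_libvirtd_service < "/proc/${PID}/cgroup"
```

If this prints anything for the libvirt_lxc process, the process sits in the cgroup of the libvirtd service for those controllers, which matches the destination path seen in the trace.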
--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list