Re: Container restart issue: Failed to attach 1 to compat systemd cgroup

Lewis Gaul <lewis.gaul@xxxxxxxxx> · Thu, 12 Jan 2023 15:48:58 +0000

Another data point: I can reproduce on an Ubuntu 18.04 host, which has systemd v237 in *hybrid* cgroup mode (assuming I've understood the definition of hybrid, as per my previous email). So it looks like this might be an issue with interoperation between host and container systemd, introduced somewhere between v239 and v245 for host systemd when the container is running v245 (also seen with v244 and v249).
Thanks,
Lewis

On Thu, 12 Jan 2023 at 15:31, Lewis Gaul <lewis.gaul@xxxxxxxxx> wrote:
Hey Michal,
Thanks for the reply.

> I'd suggest looking at debug-level logs from the host's systemd around the time of the container restart.

Could you suggest commands to run to do this?

> What is the host's systemd version and cgroup mode (legacy, hybrid, unified)? (I'm not sure what the distros in your original message referred to.)

The issue has been seen on CentOS 8.2 and 8.4 host distros, but not on Ubuntu 20.04. The former have systemd v239 and appear to be in 'legacy' cgroup mode (no /sys/fs/cgroup/unified cgroup2 mount), whereas the latter has systemd v245 and is in what I believe you'd refer to as 'hybrid' mode (with the /sys/fs/cgroup/unified cgroup2 mount).
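
As a rough illustration of the mount-based check described above, the following is a minimal sketch that classifies a host from the contents of /proc/self/mounts. The function name `classify_cgroup_mode` is hypothetical, and the rule it applies (cgroup2 mounted at /sys/fs/cgroup means unified, a cgroup2 mount at /sys/fs/cgroup/unified means hybrid, only cgroup v1 mounts means legacy) is an assumption based on the conventional mount layout, not a copy of systemd's own detection logic.

```python
def classify_cgroup_mode(mounts_text):
    """Guess the cgroup mode from /proc/self/mounts-style text.

    Returns 'unified', 'hybrid', 'legacy', or 'unknown'.
    """
    fstypes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            # Fields are: device, mount point, filesystem type, options, ...
            fstypes[fields[1]] = fields[2]

    # Assumed rule: cgroup2 directly on /sys/fs/cgroup is the unified layout.
    if fstypes.get("/sys/fs/cgroup") == "cgroup2":
        return "unified"
    # Assumed rule: a separate cgroup2 mount at .../unified is the hybrid layout.
    if fstypes.get("/sys/fs/cgroup/unified") == "cgroup2":
        return "hybrid"
    # Only cgroup v1 controller hierarchies are mounted.
    if any(fstype == "cgroup" for fstype in fstypes.values()):
        return "legacy"
    return "unknown"
```

Passing it the text read from /proc/self/mounts, on the host and again inside the container, would show whether the two sides agree on the mode.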

Should we be suspicious of the host systemd version and/or the fact that the host is in 'legacy' mode while the container (based on the systemd version being higher) is in 'hybrid' mode? Maybe we should try telling the container systemd to run in 'legacy' mode somehow?

Thanks,
Lewis

On Thu, 12 Jan 2023 at 13:12, Michal Koutný <mkoutny@xxxxxxxx> wrote:
Hello.

On Tue, Jan 10, 2023 at 03:28:04PM +0000, Lewis Gaul <lewis.gaul@xxxxxxxxx> wrote:

> I can confirm that the container has permissions since executing a 'mkdir'
> in /sys/fs/cgroup/systemd/machine.slice/libpod-<ctr-id>.scope/ inside the
> container succeeds after the restart, so I have no idea why systemd is not
> creating the 'init.scope/' dir.

It looks like it could also be a race/deferred impact from host's systemd.

> I notice that inside the container's systemd cgroup mount
> 'system.slice/' does exist, but 'user.slice/' also does not (both
> exist on normal boot). Is there any way I can find systemd logs that
> might indicate why the cgroup dir creation is failing?

I'd suggest looking at debug-level logs from the host's systemd around
the time of the container restart.

> I could raise this with the podman team, but it seems more in the systemd
> area given it's a systemd warning and I would expect systemd to be creating
> this cgroup dir?

What is the host's systemd version and cgroup mode
(legacy, hybrid, unified)? (I'm not sure what the distros in your original
message referred to.)

Thanks,
Michal