On Tue, Feb 01, 2022 at 02:56:23PM -0600, Jeremy Linton wrote:
> With CONFIG_MEMCG_KMEM and CONFIG_PROVE_LOCKING enabled (fedora
> rawhide kernel), running a simple podman test tosses a circular
> locking dependency warning. The podman container in question simply
> contains the echo command and the libc/ld-linux needed to run it. The
> warning can be duplicated with just a single `podman build --network
> host --layers=false -t localhost/echo .` command, although the exact
> sequence that triggers the warning needs the task state to be changing
> the frozen state as well. So, it's easier to duplicate with a slightly
> longer test case.
>
> I've attempted to trigger the actual deadlock with some standalone
> code and been unsuccessful, but looking at the code it appears to be a
> legitimate deadlock if a signal is being sent to the process from
> another thread while the task is migrating between cgroups.
>
> Attached is a fix which I'm confident fixes the problem, but I'm not
> really that confident in the fix since I don't fully understand all
> the possible states in the cgroup code. The fix avoids the deadlock by
> shifting the objcg->list manipulation to another spinlock and then
> using list_del_rcu in obj_cgroup_release.
>
> There is a bit more information in the actual BZ
> https://bugzilla.redhat.com/show_bug.cgi?id=2033016 including a shell
> script with the podman test/etc.

Hi Jeremy!

Thank you for the report and the patch!

We discussed this issue some time ago, and I posted a very similar patch:
https://marc.info/?l=linux-cgroups&m=164221633621286&w=2

I also resent the latest version a few hours ago, but somehow the mail
didn't make it to the mailing lists. Anyway, I've added you explicitly
to Cc and just resent it.

Thanks!