Daniel Jordan <daniel.m.jordan@xxxxxxxxxx> writes:

> Peter Zijlstra <peterz@xxxxxxxxxxxxx> writes:
>>> The nature of this bug is also described here (with different consequences):
>>> https://lore.kernel.org/lkml/20200211141554.24181-1-qais.yousef@xxxxxxx/
>>
>> Yeah, pesky deadlocks.. someone was going to try again.
>
> I dug up the synchronous patch
>
>   https://lore.kernel.org/lkml/1579878449-10164-1-git-send-email-prsood@xxxxxxxxxxxxxx/
>
> but surprisingly wasn't able to reproduce the lockdep splat from
>
>   https://lore.kernel.org/lkml/F0388D99-84D7-453B-9B6B-EEFF0E7BE4CC@xxxxxx/
>
> even though I could hit it a few weeks ago.

Oh okay, you need to mount a legacy cpuset hierarchy.

So, as the above splat shows, making cpuset_hotplug_workfn() synchronous
means cpu_hotplug_lock (and "cpuhp_state-down") can be acquired before
cgroup_mutex.  But there are at least four cgroup paths that take the
locks in the opposite order: they all take cgroup_mutex and then take
cpu_hotplug_lock later on to modify one or more static keys.

cpu_hotplug_lock should probably be taken ahead of cgroup_mutex, since
the latter is taken in a hotplug callback, and we should keep the static
branches in cgroup.  So the only way out I can think of is moving
cpu_hotplug_lock to just before cgroup_mutex is taken and switching to
the _cpuslocked flavors of the static key calls (sketch at the end of
this mail).

lockdep quiets down with that change everywhere, but it puts another big
lock around a lot of cgroup paths.  Seems less heavyhanded to go with
this RFC.  What do you all think?

Absent further discussion, Alexey, do you plan to post another version?
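
For concreteness, here's a minimal sketch of the shape of the
lock-reordering alternative described above.  cgroup_example_write() and
example_key are made-up stand-ins for the real call sites and keys, and
the local cgroup_mutex stands in for the one in kernel/cgroup/cgroup.c:

#include <linux/cpu.h>
#include <linux/jump_label.h>
#include <linux/mutex.h>

static DEFINE_STATIC_KEY_FALSE(example_key);
static DEFINE_MUTEX(cgroup_mutex);	/* stand-in for cgroup's lock */

static int cgroup_example_write(void)
{
	/*
	 * Take cpu_hotplug_lock first so this path's lock order matches
	 * the hotplug callback, which already holds cpu_hotplug_lock
	 * when it takes cgroup_mutex.
	 */
	cpus_read_lock();
	mutex_lock(&cgroup_mutex);

	/*
	 * Plain static_branch_enable() takes cpu_hotplug_lock itself;
	 * the _cpuslocked flavor expects the caller to hold it already.
	 */
	static_branch_enable_cpuslocked(&example_key);

	mutex_unlock(&cgroup_mutex);
	cpus_read_unlock();

	return 0;
}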