Hello Hillf.

(Sorry for the late reply.)

On Wed, Sep 11, 2024 at 07:15:42PM GMT, Hillf Danton <hdanton@xxxxxxxx> wrote:
> > However, there is no ordering between (I) and (II) so they can also happen
> > in opposite
> >
> >   thread T                                    system_wq worker
> >
> >   down(cpu_hotplug_lock.read)
> >   smp_call_on_cpu
> >     queue_work_on(cpu, system_wq, scss) (I)
> >                                               lock(cgroup_mutex)  (II)
> >                                               ...
> >                                               unlock(cgroup_mutex)
> >                                               scss.func
> >     wait_for_completion(scss)
> >   up(cpu_hotplug_lock.read)
> >
> > And here the thread T + system_wq worker effectively call
> > cpu_hotplug_lock and cgroup_mutex in the wrong order. (And since they're
> > two threads, it won't be caught by lockdep.)
> >
> Given no workqueue work executed without being dequeued, any queued work,
> regardless if they are more than 2048, that acquires cgroup_mutex could not
> prevent the work queued by thread-T from being executed, so thread-T can
> make safe forward progress, therefore with no chance left for the ABBA
> deadlock you spotted where lockdep fails to work.

Is the negation there by mistake, and did you intend to write: "any queued
work ... that acquired cgroup_mutex could prevent"?

Or, if the negation is correct, why do you think that the work item being
processed is _not_ preventing thread T from running (in the case I left
quoted above)?

Thanks,
Michal