Re: [PATCH 2/2] cgroup: Use separate work structs on css release path

Michal Koutný <mkoutny@xxxxxxxx> · Thu, 2 Jun 2022 13:47:05 +0200

On Wed, Jun 01, 2022 at 05:40:51PM -0700, Tadeusz Struk <tadeusz.struk@xxxxxxxxxx> wrote:
> css_killed_ref_fn() will be called regardless of the value of refcnt (via percpu_ref_kill_and_confirm())
> and it will only enqueue the css_killed_work_fn() to be called later.
> Then css_put()->css_release() will be called before the css_killed_work_fn() will even
> get a chance to run, and it will also *only* enqueue css_release_work_fn() to be called later.
> The problem happens on the second enqueue. So there need to be something in place that
> will make sure that css_killed_work_fn() is done before css_release() can enqueue
> the second job.

IIUC, here you describe the same scenario I broke down at [1].
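
The quoted scenario can be sketched as a user-space toy. This is not kernel code: every toy_* name is hypothetical, and it assumes that each enqueue re-initializes the shared work item before queueing it, in the way INIT_WORK() followed by queue_work() would. The consistency check in toy_queue_work() is only similar in spirit to the kernel's list debugging checks.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins only; these are NOT the kernel's list_head, work_struct
 * or workqueue implementations. */
struct toy_node { struct toy_node *next, *prev; };
typedef void (*toy_fn)(void);
struct toy_work { struct toy_node entry; toy_fn func; };

static void toy_killed_fn(void) {}
static void toy_release_fn(void) {}

/* Resets the linkage unconditionally, even if the item is still queued. */
static void toy_init_work(struct toy_work *w, toy_fn fn)
{
	w->entry.next = w->entry.prev = &w->entry;
	w->func = fn;
}

/* Tail insert; returns false when the tail's linkage is inconsistent. */
static bool toy_queue_work(struct toy_node *head, struct toy_work *w)
{
	struct toy_node *prev = head->prev;

	if (prev->next != head)
		return false;
	w->entry.next = head;
	w->entry.prev = prev;
	prev->next = &w->entry;
	head->prev = &w->entry;
	return true;
}

/* Returns true if the second enqueue finds a corrupted queue. */
static bool toy_double_enqueue_corrupts(void)
{
	struct toy_node wq = { &wq, &wq };  /* stand-in for cgroup_destroy_wq */
	struct toy_work destroy_work;       /* one item shared by both paths */
	bool ok;

	toy_init_work(&destroy_work, toy_killed_fn);
	ok = toy_queue_work(&wq, &destroy_work);
	assert(ok);

	/* Second path reuses the item before the first job has run. */
	toy_init_work(&destroy_work, toy_release_fn);
	return !toy_queue_work(&wq, &destroy_work);
}
```

Calling toy_double_enqueue_corrupts() from a main() reports whether the second enqueue sees a queue whose tail no longer points back at the head.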

> Does it sound right?

I added a parameter A there (that is, the sum of base and percpu references
before kill_css()).
I thought it failed because A == 1 (i.e. killing the base reference);
however, that seems an unlikely situation (because cgroup code uses a
"fuse" reference to pin the css for offline_css()).

So the remaining option (at least I find it more likely now) is that
A == 0 (A < 0 would trigger the warning in
percpu_ref_switch_to_atomic_rcu()), aka the ref imbalance. I hope we can
get to the bottom of this with detailed enough tracing of gets/puts.
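
The accounting above can be mimicked with a small user-space toy. It is a sketch under loud assumptions, not the kernel's percpu_ref or cgroup implementation: all toy_* names are hypothetical, the ordering is simplified to "fuse get, base put, confirm, zero check", puts by other reference holders are not modelled, and the A < 0 warning case is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of this is the kernel's percpu_ref or cgroup code. */
struct toy_css {
	long refs;            /* starts at A, then follows the gets/puts */
	bool killed_pending;  /* killed work queued but not yet run */
	bool released;        /* release path already taken */
	bool double_enqueue;  /* release queued while killed work pending */
};

/* Takes the release path once the count reaches zero. */
static void toy_release_check(struct toy_css *c)
{
	if (c->refs != 0 || c->released)
		return;
	c->released = true;
	if (c->killed_pending)
		c->double_enqueue = true;
}

/* Returns true if the shared work item would be enqueued twice. */
static bool toy_kill(long A)
{
	struct toy_css c = { .refs = A };

	c.refs++;                 /* "fuse" reference taken for offlining */
	c.refs--;                 /* base reference dropped by the kill */
	c.killed_pending = true;  /* confirm callback queues the killed work */
	toy_release_check(&c);    /* zero is only noticed after confirmation */

	/* The killed work runs later and drops the fuse reference. */
	c.killed_pending = false;
	c.refs--;
	toy_release_check(&c);

	return c.double_enqueue;
}
```

In this toy, toy_kill(1) returns false because the fuse reference keeps the count above zero until the killed work has run, while toy_kill(0) returns true, which corresponds to the ref-imbalance case.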

Splitting the work struct contradicts the existing approach with
the "fuse" reference.

(BTW you also wrote:

On Wed, Jun 01, 2022 at 05:00:44PM -0700, Tadeusz Struk <tadeusz.struk@xxxxxxxxxx> wrote:
> The fact the css_release() is called (via cgroup_kn_unlock()) just after
> kill_css() causes the css->destroy_work to be enqueued twice on the same WQ
> (cgroup_destroy_wq), just with different function. This results in the
> BUG: corrupted list in insert_work issue.

Where do you see a critical css_release() called from cgroup_kn_unlock()?
I have always observed css_release() being called via
percpu_ref_call_confirm_rcu() (in the original and subsequent syzbot
logs).)

Thanks,
Michal

[1] https://lore.kernel.org/r/Yo7KfEOz92kS2z5Y@blackbook/