On Wed, 18 May 2022 09:48:21 -0700 Tadeusz Struk wrote:
> On 4/22/22 04:05, Michal Koutny wrote:
> > On Thu, Apr 21, 2022 at 02:00:56PM -1000, Tejun Heo <tj@xxxxxxxxxx> wrote:
> >> If this is the case, we need to hold an extra reference to be put by the
> >> css_killed_work_fn(), right?

That put could trigger INIT_WORK in css_release() and the "init active
(active state 0) object" warning [1], OTOH, as the same css->destroy_work
is used in both the kill and release paths.

Hillf

[1] https://lore.kernel.org/lkml/000000000000ff747805debce6c6@xxxxxxxxxx/

> >
> > I looked into it a bit more lately and found that there already is such
> > a fuse in kill_css() [1].
> >
> > At the same time, syzbot's stack trace demonstrates the fuse is
> > ineffective
> >
> >> css_release+0xae/0xc0 kernel/cgroup/cgroup.c:5146 (**)
> >> percpu_ref_put_many include/linux/percpu-refcount.h:322 [inline]
> >> percpu_ref_put include/linux/percpu-refcount.h:338 [inline]
> >> percpu_ref_call_confirm_rcu lib/percpu-refcount.c:162 [inline] (*)
> >> percpu_ref_switch_to_atomic_rcu+0x5a2/0x5b0 lib/percpu-refcount.c:199
> >> rcu_do_batch+0x4f8/0xbc0 kernel/rcu/tree.c:2485
> >> rcu_core+0x59b/0xe30 kernel/rcu/tree.c:2722
> >> rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2735
> >> __do_softirq+0x27e/0x596 kernel/softirq.c:305
> >
> > (*) this calls css_killed_ref_fn confirm_switch
> > (**) zero references after confirmed kill?
> >
> > So, I was also looking at the possible race with css_free_rwork_fn()
> > (from failed css_create()) but that would likely emit a warning from
> > __percpu_ref_exit().
> >
> > So, I still think there's something fishy (so far possible only via
> > artificial ENOMEM injection) that needs an explanation...
>
> I can't reliably reproduce this issue on either mainline or v5.10, where
> syzbot originally found it. It still triggers for syzbot though.
>
> --
> Thanks,
> Tadeusz