Hello, Michal. On Thu, May 26, 2022 at 11:56:34AM +0200, Michal Koutný wrote: > // ref=A: initial state > kill_css() > css_get // ref+=F == A+F: fuse > percpu_ref_kill_and_confirm > __percpu_ref_switch_to_atomic > percpu_ref_get > // ref += 1 == A+F+1: atomic mode, self-protection > percpu_ref_put > // ref -= 1 == A+F: kill the base reference > [via rcu] > percpu_ref_switch_to_atomic_rcu > percpu_ref_call_confirm_rcu > css_killed_ref_fn == refcnt.confirm_switch > queue_work(css->destroy_work) (1) > [via css->destroy_work] > css_killed_work_fn == wq.func > offline_css() // needs fuse > css_put // ref -= F == A: de-fuse > percpu_ref_put > // ref -= 1 == A-1: remove self-protection > css_release // A <= 1 -> 2nd queue_work explodes! I'm not sure I'm following it but it's perfectly fine to re-use the work item at this point. The work item actually can be re-cycled from the very beginning of the work function. The only thing we need to make sure is that we don't css_put() prematurely to avoid it being freed while we're using it. For the sharing to be a problem, we should be queueing the release work item while the destroy instance is still pending, and if that is the case, it doesn't really matter whether we use two separate work items or not. We're already broken and would just be shifting the problem to explode elsewhere. The only possibility that I can think of is that somehow we're ending up with an extra css_put() somewhere thus triggering the release path prematurely. If that's the case, we'll prolly need to trace get/puts to find out who's causing the ref imbalance. Thanks. -- tejun