Hi Tejun, Could you please help take a look at this and give a confirmation? Thanks, Joseph On 18/2/24 09:45, Joseph Qi wrote: > Hi Tejun, > > On 18/2/23 22:23, Tejun Heo wrote: >> Hello, >> >> On Fri, Feb 23, 2018 at 09:56:54AM +0800, xuejiufei wrote: >>>> On Thu, Feb 22, 2018 at 02:14:34PM +0800, Joseph Qi wrote: >>>>> I still don't get how css_tryget can work here. >>>>> >>>>> The race happens when: >>>>> 1) writeback kworker has found the blkg with rcu; >>>>> 2) blkcg is during offlining and blkg_destroy() has already been called. >>>>> Then, writeback kworker will take queue lock and access the blkg with >>>>> refcount 0. >>>> >>>> Yeah, then tryget would fail and it should go through the root. >>>> >>> In this race, the refcount of blkg becomes zero and is destroyed. >>> However css may still have refcount, and css_tryget can return success >>> before other callers put the refcount. >>> So I don't get how css_tryget can fix this race? Or I wonder if we can >>> add another function blkg_tryget? >> >> IIRC, as long as the blkcg and the device are there, the blkgs aren't >> gonna be destroyed. So, if you have a ref to the blkcg through >> tryget, the blkg shouldn't go away. >> > > Maybe we have misunderstanding here. > > In this case, blkg doesn't go away as we have rcu protect, but > blkg_destroy() can be called, in which blkg_put() will put the last > refcnt and then schedule __blkg_release_rcu(). > > css refcnt can't prevent blkcg css from offlining, instead it is css > online_cnt. > > css_tryget() will only get a refcnt of blkcg css, but can't be > guaranteed to fail when css is confirmed to kill. > > The race sequence: > writeback kworker cgroup_rmdir > cgroup_destroy_locked > kill_css > css_killed_ref_fn > css_killed_work_fn > offline_css > blkcg_css_offline > blkcg_bio_issue_check > rcu_read_lock > blkg_lookup > spin_trylock(q->queue_lock) > blkg_destroy > spin_unlock(q->queue_lock) > blk_throtl_bio > spin_lock_irq(q->queue_lock) > spin_unlock_irq(q->queue_lock) > rcu_read_unlock > > Thanks, > Joseph >