Hello,

On Fri, Feb 23, 2018 at 09:56:54AM +0800, xuejiufei wrote:
> > On Thu, Feb 22, 2018 at 02:14:34PM +0800, Joseph Qi wrote:
> >> I still don't get how css_tryget can work here.
> >>
> >> The race happens when:
> >> 1) writeback kworker has found the blkg with rcu;
> >> 2) blkcg is during offlining and blkg_destroy() has already been called.
> >> Then, writeback kworker will take queue lock and access the blkg with
> >> refcount 0.
> >
> > Yeah, then tryget would fail and it should go through the root.
> >
> In this race, the refcount of blkg becomes zero and is destroyed.
> However css may still have refcount, and css_tryget can return success
> before other callers put the refcount.
> So I don't get how css_tryget can fix this race? Or I wonder if we can
> add another function blkg_tryget?

IIRC, as long as the blkcg and the device are there, the blkgs aren't
gonna be destroyed.  So, if you have a ref to the blkcg through
tryget, the blkg shouldn't go away.

Thanks.

-- 
tejun
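
For readers following the thread, here is a minimal userspace sketch of the "tryget, else fall back to the root" pattern discussed above. It is not kernel code and not the actual css_tryget() implementation: struct group, group_tryget(), group_put() and pin_or_root() are hypothetical names, and a plain C11 atomic counter stands in for the real reference counting. It only illustrates that a tryget never takes a reference on an object whose count has already dropped to zero.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted object such as a css. */
struct group {
	atomic_int refcnt;
};

/* Take a reference only if the count is still nonzero. */
static bool group_tryget(struct group *g)
{
	int old = atomic_load(&g->refcnt);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current count. */
		if (atomic_compare_exchange_weak(&g->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference taken by group_tryget(). */
static void group_put(struct group *g)
{
	atomic_fetch_sub(&g->refcnt, 1);
}

/*
 * Return 'target' with an extra reference if it can still be pinned,
 * otherwise return 'root' (assumed to be always alive, no extra
 * reference taken in this sketch).
 */
static struct group *pin_or_root(struct group *target, struct group *root)
{
	return group_tryget(target) ? target : root;
}
```

A caller would use pin_or_root() before touching anything owned by the target group and call group_put() afterwards if the target itself was returned. Whether pinning the blkcg this way is enough to keep the blkg alive is exactly the question debated in the thread; the sketch does not settle it.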