Hi Tejun, Sorry for the delayed reply. On 18/2/13 01:11, Tejun Heo wrote: > Hello, Joseph. > > On Fri, Feb 09, 2018 at 10:15:19AM +0800, Joseph Qi wrote: >> IIUC, we have to identify it is in blkcg_css_offline now which will >> blkg_put. Since percpu_ref_kill_and_confirm in kill_css will set flag >> __PERCPU_REF_DEAD, so we can use this to avoid the race. IOW, if >> __PERCPU_REF_DEAD is set now, we know blkcg css is in offline and >> continue access blkg may risk double free. Thus we choose to skip these >> ios. >> I don't get how css_tryget works since it doesn't care the flag >> __PERCPU_REF_DEAD. Also css_tryget can't prevent blkcg_css from >> offlining since the race happens blkcg_css_offline is in progress. >> Am I missing something here? > > Once marked dead, the ref is in atomic mode and css_tryget() would hit > the atomic counter. Here, we don't care about the offlining and > draining. A draining memcg can still have a lot of memory to be > written back attached to it and we don't want punt all of them to the > root cgroup. I still don't get how css_tryget can work here. The race happens when: 1) writeback kworker has found the blkg with rcu; 2) blkcg is during offlining and blkg_destroy() has already been called. Then, writeback kworker will take queue lock and access the blkg with refcount 0. So, I think we should take queue lock when lookup blkg if we want to throttle the ios during offline (the way this patch tries), or use css_tryget_online() to skip the further use of the risky blkg, which means these ios won't be throttled either. Thanks, Joseph -- To unsubscribe from this list: send the line "unsubscribe cgroups" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html