Re: [PATCH v2] blk-throttle: fix race between blkcg_bio_issue_check and cgroup_rmdir

Joseph Qi <joseph.qi@xxxxxxxxxxxxxxxxx> · Thu, 22 Feb 2018 14:14:34 +0800

Hi Tejun,

Sorry for the delayed reply.

On 18/2/13 01:11, Tejun Heo wrote:
> Hello, Joseph.
> 
> On Fri, Feb 09, 2018 at 10:15:19AM +0800, Joseph Qi wrote:
>> IIUC, we have to identify it is in blkcg_css_offline now which will
>> blkg_put. Since percpu_ref_kill_and_confirm in kill_css will set flag
>> __PERCPU_REF_DEAD, so we can use this to avoid the race. IOW, if
>> __PERCPU_REF_DEAD is set now, we know blkcg css is in offline and
>> continue access blkg may risk double free. Thus we choose to skip these
>> ios.
>> I don't get how css_tryget works since it doesn't care the flag
>> __PERCPU_REF_DEAD. Also css_tryget can't prevent blkcg_css from
>> offlining since the race happens blkcg_css_offline is in progress.
>> Am I missing something here?
> 
> Once marked dead, the ref is in atomic mode and css_tryget() would hit
> the atomic counter.  Here, we don't care about the offlining and
> draining.  A draining memcg can still have a lot of memory to be
> written back attached to it and we don't want punt all of them to the
> root cgroup.

I still don't get how css_tryget can work here.

The race happens when:
1) writeback kworker has found the blkg with rcu;
2) blkcg is during offlining and blkg_destroy() has already been called.
Then, writeback kworker will take queue lock and access the blkg with
refcount 0.

So, I think we should take queue lock when lookup blkg if we want to
throttle the ios during offline (the way this patch tries), or use
css_tryget_online() to skip the further use of the risky blkg, which
means these ios won't be throttled either.

Thanks,
Joseph
--
To unsubscribe from this list: send the line "unsubscribe cgroups" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html