Re: [PATCH v2] blk-throttle: fix race between blkcg_bio_issue_check and cgroup_rmdir

Joseph Qi <joseph.qi@xxxxxxxxxxxxxxxxx> · Thu, 8 Feb 2018 10:29:43 +0800

Hi Tejun,
Thanks very much for reviewing this patch.

On 18/2/8 05:38, Tejun Heo wrote:
> Hello, Joseph.
> 
> On Wed, Feb 07, 2018 at 04:40:02PM +0800, Joseph Qi wrote:
>> writeback kworker
>>   blkcg_bio_issue_check
>>     rcu_read_lock
>>     blkg_lookup
>>     <<< *race window*
>>     blk_throtl_bio
>>       spin_lock_irq(q->queue_lock)
>>       spin_unlock_irq(q->queue_lock)
>>     rcu_read_unlock
>>
>> cgroup_rmdir
>>   cgroup_destroy_locked
>>     kill_css
>>       css_killed_ref_fn
>>         css_killed_work_fn
>>           offline_css
>>             blkcg_css_offline
>>               spin_trylock(q->queue_lock)
>>               blkg_destroy
>>               spin_unlock(q->queue_lock)
> 
> Ah, right.  Thanks for spotting the bug.
> 
>> Since rcu can only prevent blkg from releasing when it is being used,
>> the blkg->refcnt can be decreased to 0 during blkg_destroy and schedule
>> blkg release.
>> Then blkg_get in blk_throtl_bio triggers the WARNING, and the
>> corresponding blkg_put schedules blkg release again, which results in
>> a double free.
>> This race is introduced by commit ae1188963611 ("blkcg: consolidate blkg
>> creation in blkcg_bio_issue_check()"). Before this commit, the code
>> looked up first and then tried to lookup/create again under
>> queue_lock. So revive this logic to fix the race.
> 
> The change seems a bit drastic to me.  Can't we do something like the
> following instead?
> 
> blk_throtl_bio()
> {
> 	... non throttled cases ...
> 
> 	/* out-of-limit, queue to @tg */
> 
> 	/*
> 	 * We can look up and retry but the race window is tiny here.
> 	 * Just letting it through should be good enough.
> 	 */
> 	if (!css_tryget(blkcg->css))
> 		goto out;
> 
> 	... actual queueing ...
> 	css_put(blkcg->css);
> 	...
> }

So you mean checking css->refcnt to prevent any further use of
blkg_get? I think that makes sense.
IMO, we should use css_tryget_online instead, and do it right after
taking queue_lock, because there may be more uses of blkg_get in
blk_throtl_bio in the future. Actually, there are already two now: one
in blk_throtl_assoc_bio and the other in throtl_qnode_add_bio.
What do you think?
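
To make the difference between an unconditional get and a tryget
concrete, here is a minimal userspace sketch. It is only a model under
stated assumptions: struct model_ref, model_get, model_tryget, model_put
and replay_race are hypothetical names, a single C11 atomic counter
stands in for the real kernel refcounts, and the two paths are replayed
sequentially in the order from the trace above rather than run
concurrently. It is not the actual blk-throttle code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; not a kernel structure. */
struct model_ref {
	atomic_int count;	/* 0 means release has already been scheduled */
};

/* Unconditional get: increments even if the count already dropped to 0. */
static void model_get(struct model_ref *r)
{
	atomic_fetch_add(&r->count, 1);
}

/* Tryget: succeeds only while the count is still non-zero. */
static bool model_tryget(struct model_ref *r)
{
	int old = atomic_load(&r->count);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Put: returns true when this put dropped the last reference. */
static bool model_put(struct model_ref *r)
{
	int old = atomic_fetch_sub(&r->count, 1);

	assert(old > 0);	/* catch underflow in the model */
	return old == 1;
}

/*
 * Replay the interleaving from the trace: the rmdir path drops the base
 * reference first, then the writeback path uses its stale pointer.
 * Returns how many times release would be scheduled.
 */
static int replay_race(bool use_tryget)
{
	struct model_ref ref;
	int releases = 0;

	atomic_init(&ref.count, 1);	/* base reference */

	/* rmdir path: destroy drops the base reference. */
	if (model_put(&ref))
		releases++;

	/* writeback path: still inside the race window. */
	if (use_tryget) {
		if (!model_tryget(&ref))
			return releases;	/* let the bio through */
	} else {
		model_get(&ref);	/* 0 -> 1, where the WARNING fires */
	}

	if (model_put(&ref))
		releases++;
	return releases;
}
```

In this model, replay_race(false) schedules release twice (the double
free), while replay_race(true) schedules it once because the tryget
fails and the throttling path is skipped.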

Thanks,
Joseph