Re: [PATCH v2] blk-throttle: fix race between blkcg_bio_issue_check and cgroup_rmdir

Tejun Heo <tj@xxxxxxxxxx> · Wed, 7 Feb 2018 13:38:11 -0800

Hello, Joseph.

On Wed, Feb 07, 2018 at 04:40:02PM +0800, Joseph Qi wrote:
> writeback kworker
>   blkcg_bio_issue_check
>     rcu_read_lock
>     blkg_lookup
>     <<< *race window*
>     blk_throtl_bio
>       spin_lock_irq(q->queue_lock)
>       spin_unlock_irq(q->queue_lock)
>     rcu_read_unlock
> 
> cgroup_rmdir
>   cgroup_destroy_locked
>     kill_css
>       css_killed_ref_fn
>         css_killed_work_fn
>           offline_css
>             blkcg_css_offline
>               spin_trylock(q->queue_lock)
>               blkg_destroy
>               spin_unlock(q->queue_lock)

Ah, right.  Thanks for spotting the bug.

> Since RCU only prevents the blkg from being freed while it is in use,
> blkg->refcnt can still drop to 0 during blkg_destroy, which schedules
> the blkg release.
> The subsequent blkg_get in blk_throtl_bio then triggers a WARNING,
> and the corresponding blkg_put schedules the blkg release again,
> which results in a double free.
> This race was introduced by commit ae1188963611 ("blkcg: consolidate blkg
> creation in blkcg_bio_issue_check()"). Before this commit, the code looked
> up the blkg first and then tried to look it up or create it again under
> queue_lock. So revive this logic to fix the race.

The change seems a bit drastic to me.  Can't we do something like the
following instead?

blk_throtl_bio()
{
	... non throttled cases ...

	/* out-of-limit, queue to @tg */

	/*
	 * We can look up and retry but the race window is tiny here.
	 * Just letting it through should be good enough.
	 */
	/* css is embedded in struct blkcg, so pass its address */
	if (!css_tryget(&blkcg->css))
		goto out;

	... actual queueing ...
	css_put(&blkcg->css);
	...
}
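
To make the tryget idea concrete outside the kernel, here is a minimal
userspace sketch in C11.  It is only a model: struct ref, ref_tryget(),
ref_put() and try_queue() are hypothetical names, and a plain atomic
counter stands in for the css reference count, which the kernel
implements differently.  The point it shows is that a tryget refuses to
resurrect a count that has already reached zero, so the caller can skip
queueing instead of triggering a second release.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a css-style reference count. */
struct ref {
	atomic_int count;	/* 0 means release has already been scheduled */
};

static void ref_init(struct ref *r)
{
	atomic_init(&r->count, 1);
}

/* Take a reference only if the count has not reached zero yet. */
static bool ref_tryget(struct ref *r)
{
	int old = atomic_load(&r->count);

	while (old > 0) {
		/* On failure, old is reloaded and the loop re-checks it. */
		if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true when the caller dropped the last one. */
static bool ref_put(struct ref *r)
{
	int old = atomic_fetch_sub(&r->count, 1);

	assert(old > 0);
	return old == 1;
}

/* Models the out-of-limit path: queue only if the reference is still live. */
static bool try_queue(struct ref *r, int *queued)
{
	if (!ref_tryget(r))
		return false;	/* lost the race: let the bio through */

	(*queued)++;		/* stands in for "actual queueing" */
	ref_put(r);
	return true;
}
```

In this model, try_queue() succeeds while the initial reference is
held.  Once ref_put() has dropped the last reference, which plays the
role of the offline path here, try_queue() returns false and never
touches the count again.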

Thanks.

-- 
tejun