Hi Tejun,

Thanks very much for reviewing this patch.

On 18/2/8 05:38, Tejun Heo wrote:
> Hello, Joseph.
>
> On Wed, Feb 07, 2018 at 04:40:02PM +0800, Joseph Qi wrote:
>> writeback kworker
>> blkcg_bio_issue_check
>>   rcu_read_lock
>>   blkg_lookup
>>   <<< *race window*
>>   blk_throtl_bio
>>     spin_lock_irq(q->queue_lock)
>>     spin_unlock_irq(q->queue_lock)
>>   rcu_read_unlock
>>
>> cgroup_rmdir
>>   cgroup_destroy_locked
>>     kill_css
>>       css_killed_ref_fn
>>         css_killed_work_fn
>>           offline_css
>>             blkcg_css_offline
>>               spin_trylock(q->queue_lock)
>>               blkg_destroy
>>               spin_unlock(q->queue_lock)
>
> Ah, right.  Thanks for spotting the bug.
>
>> Since rcu can only prevent the blkg from being freed while it is being
>> used, blkg->refcnt can drop to 0 during blkg_destroy, which schedules
>> the blkg release. A subsequent blkg_get in blk_throtl_bio will then
>> trigger the WARNING, and the corresponding blkg_put will schedule the
>> blkg release again, resulting in a double free.
>> This race was introduced by commit ae1188963611 ("blkcg: consolidate blkg
>> creation in blkcg_bio_issue_check()"). Before that commit, the code did a
>> lookup first and then looked up/created the blkg again under queue_lock.
>> So revive this logic to fix the race.
>
> The change seems a bit drastic to me.  Can't we do something like the
> following instead?
>
>	blk_throtl_bio()
>	{
>		... non throttled cases ...
>
>		/* out-of-limit, queue to @tg */
>
>		/*
>		 * We can look up and retry but the race window is tiny here.
>		 * Just letting it through should be good enough.
>		 */
>		if (!css_tryget(blkcg->css))
>			goto out;
>
>		... actual queueing ...
>		css_put(blkcg->css);
>		...
>	}

So you mean checking css->refcnt to prevent further use of blkg_get?
I think it makes sense.
IMO, we should use css_tryget_online instead, and right after taking
queue_lock, because there may be more uses of blkg_get in blk_throtl_bio
in the future. Actually it already has two now.
One is in blk_throtl_assoc_bio, and the other is in throtl_qnode_add_bio.

What do you think of this?

Thanks,
Joseph
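P.S. To make the proposal concrete, here is a rough, untested sketch of
what I have in mind (the exact placement and unwinding are only for
illustration, not an actual patch):

	blk_throtl_bio()
	{
		...
		spin_lock_irq(q->queue_lock);

		/*
		 * A dying blkcg can still be found via blkg_lookup under
		 * RCU. Bail out before any of the blkg_get callers below
		 * (blk_throtl_assoc_bio, throtl_qnode_add_bio) can run
		 * against a blkg whose refcnt has already hit zero.
		 */
		if (unlikely(!css_tryget_online(&blkcg->css))) {
			spin_unlock_irq(q->queue_lock);
			goto out;
		}

		... association and actual queueing ...

		css_put(&blkcg->css);
		spin_unlock_irq(q->queue_lock);
		...
	}

Taking the reference right after queue_lock means every later blkg_get in
the function is covered by a single check, instead of guarding each call
site separately.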