Re: [PATCH v2 1/2] blk-iocost: add refcounting for iocg

Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> · Thu, 12 Jan 2023 14:18:15 +0800

Hi,

On 2023/01/12 1:07, Tejun Heo wrote:
> Hello,
>
> On Wed, Jan 11, 2023 at 09:36:25AM +0800, Yu Kuai wrote:
>> I'm not sure, of course this can fix the problem, but two spinlock
>> 'blkcg->lock' and 'q->queue_lock' are used to protect blkg_destroy()
>> currently, add a mutex (disk level?) requires a refactor, which seems
>> complex to me.
>
> The fact that the two paths can race each other already seems buggy. e.g.
> What prevents them from running pd_free on the same pd twice? So, it needs

I think the root cause is that each blkg is tracked on two different
lists: blkcg->blkg_list at the cgroup level and q->blkg_list at the disk
level. pd_free_fn is also called from both blkg_destroy() and from policy
deactivation for a disk.

I just thought about another solution:

Remove the blkcg_deactivate_policy() call from rq_qos_exit() when deleting
the device, and delay the policy cleanup and free to blkg_destroy_all().
Then the policies (other than bfq) can only have pd_free_fn() called from
blkg_destroy(), and it's easy to guarantee the order. bfq can stay the
same, since it does its own refcounting.

Then, for the problem that ioc can be freed in pd_free_fn(), we can fix
it by freeing ioc in ioc_pd_free() for the root blkg instead of in
rq_qos_exit().
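
To illustrate the ownership idea, here is a minimal userspace sketch, not
kernel code. All names in it (ioc_model, pd_model, pd_alloc_model,
pd_free_model, ioc_free_count) are hypothetical, and it assumes the root
blkg's pd is always freed last, which is exactly the ordering that
blkg_destroy() would have to guarantee:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-disk ioc; not the kernel struct. */
struct ioc_model {
	int nr_pds;		/* pds that still point at this ioc */
};

/* Hypothetical stand-in for a per-blkg policy data (pd). */
struct pd_model {
	struct ioc_model *ioc;
	bool is_root;		/* true only for the root blkg's pd */
};

/* Counts ioc releases so the ordering can be observed from outside. */
static int ioc_free_count;

static struct pd_model *pd_alloc_model(struct ioc_model *ioc, bool is_root)
{
	struct pd_model *pd = malloc(sizeof(*pd));

	if (!pd)
		return NULL;
	pd->ioc = ioc;
	pd->is_root = is_root;
	ioc->nr_pds++;
	return pd;
}

/*
 * Single free path: the ioc is released only together with the root pd,
 * so every non-root pd can still dereference pd->ioc while being freed.
 */
static void pd_free_model(struct pd_model *pd)
{
	struct ioc_model *ioc = pd->ioc;

	ioc->nr_pds--;
	if (pd->is_root) {
		/* Assumption of the sketch: the root pd goes last. */
		assert(ioc->nr_pds == 0);
		free(ioc);
		ioc_free_count++;
	}
	free(pd);
}
```

Freeing a child pd first leaves the ioc alive; only freeing the root pd
afterwards releases it, and it is released exactly once.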

What do you think?

Thanks,
Kuai
> to be fixed anyway and the intention always has been that these callbacks
> are called in the correct traversal order.
>
> Thanks.