Hi,
On 2023/01/12 1:07, Tejun Heo wrote:
Hello,
On Wed, Jan 11, 2023 at 09:36:25AM +0800, Yu Kuai wrote:
I'm not sure. Of course this can fix the problem, but two spinlocks,
'blkcg->lock' and 'q->queue_lock', are currently used to protect
blkg_destroy(); adding a mutex (disk level?) requires a refactor, which
seems complex to me.
The fact that the two paths can race each other already seems buggy. e.g.
What prevents them from running pd_free on the same pd twice? So, it needs
to be fixed anyway and the intention always has been that these callbacks
are called in the correct traversal order.
Thanks.
I think the root cause is that a blkg is tracked on two different lists:
blkcg->blkg_list at the cgroup level and q->blkg_list at the disk level. And
pd_free_fn is also called from both blkg_destroy() and from deactivating the
policy for a disk.
I just thought about another solution:
remove the blkcg_deactivate_policy() call from rq_qos_exit() when deleting
the device, and delay the policy cleanup and free to blkg_destroy_all().
Then the policies (other than bfq) can only call pd_free_fn() from
blkg_destroy(), and it's easy to guarantee the order. For bfq, it can
stay the same since bfq has its own refcounting.
Then for the problem that ioc can be freed in pd_free_fn(), we can fix
it by freeing ioc in ioc_pd_free() for the root blkg instead of in
rq_qos_exit().
What do you think?
Thanks,
Kuai
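To make the ordering idea concrete, here is a toy userspace model in C of the proposal to call pd_free_fn() only from blkg_destroy() and to release ioc when the root blkg's pd is freed. This is not kernel code: every toy_* name is made up for illustration, and it assumes the root sits at index 0 and that children are destroyed before it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for the per-disk policy object (e.g. an ioc). Hypothetical. */
struct toy_ioc {
    bool released;            /* set when the root pd is freed */
    size_t live_pds;          /* pds that have not been freed yet */
    size_t others_at_release; /* non-root pds still alive at release time */
};

/* Toy stand-in for per-blkg policy data. Hypothetical. */
struct toy_pd {
    bool is_root;
    struct toy_ioc *ioc;
};

struct toy_blkg {
    struct toy_pd *pd;        /* NULL once freed */
    int free_count;           /* how many times the pd free path ran */
};

/* The only place a pd is freed, modelling "pd_free_fn() only from blkg_destroy()". */
void toy_blkg_destroy(struct toy_blkg *blkg)
{
    struct toy_pd *pd = blkg->pd;
    struct toy_ioc *ioc;

    assert(pd != NULL);       /* a second call here would be the double free */
    ioc = pd->ioc;
    if (pd->is_root) {
        /* Release the ioc together with the root pd, not in a separate exit path. */
        ioc->others_at_release = ioc->live_pds - 1;
        ioc->released = true;
    }
    ioc->live_pds--;
    free(pd);
    blkg->pd = NULL;
    blkg->free_count++;
}

/* Assumption: index 0 is the root, so walking backwards frees children first. */
void toy_blkg_destroy_all(struct toy_blkg *blkgs, size_t n)
{
    for (size_t i = n; i-- > 0; )
        toy_blkg_destroy(&blkgs[i]);
}

/* Returns 0 if every pd was freed exactly once and the ioc was released last. */
int toy_run(size_t n)
{
    struct toy_ioc ioc = { .released = false, .live_pds = 0, .others_at_release = 0 };
    struct toy_blkg *blkgs = calloc(n, sizeof(*blkgs));
    int ret = 0;

    if (!blkgs)
        return -1;
    for (size_t i = 0; i < n; i++) {
        struct toy_pd *pd = malloc(sizeof(*pd));

        if (!pd)
            abort();          /* keep the toy simple on allocation failure */
        pd->is_root = (i == 0);
        pd->ioc = &ioc;
        blkgs[i].pd = pd;
        ioc.live_pds++;
    }

    toy_blkg_destroy_all(blkgs, n);

    for (size_t i = 0; i < n; i++)
        if (blkgs[i].free_count != 1)
            ret = 1;
    if (!ioc.released || ioc.others_at_release != 0 || ioc.live_pds != 0)
        ret = 1;

    free(blkgs);
    return ret;
}
```

Calling toy_run(3) builds a root plus two children, tears them down through the single destroy path, and returns 0 when each pd was freed once and no other pd was still alive when the ioc was released. With only one free path there is nothing to race against in this model; the real code would still need the existing locking around blkg_destroy().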