Hello, On Mon, Dec 05, 2022 at 05:32:17PM +0800, Yu Kuai wrote: > 1) queue_lock is held to protect rq_qos_add() and rq_qos_del(), whlie > it's not held to protect rq_qos_exit(), which is absolutely not safe > because they can be called concurrently by configuring iocost and > removing device. > I'm thinking about holding the lock to fetch the list and reset > q->rq_qos first: > > diff --git a/block/blk-rq-qos.c b/block/blk-rq-qos.c > index 88f0fe7dcf54..271ad65eebd9 100644 > --- a/block/blk-rq-qos.c > +++ b/block/blk-rq-qos.c > @@ -288,9 +288,15 @@ void rq_qos_wait(struct rq_wait *rqw, void > *private_data, > > void rq_qos_exit(struct request_queue *q) > { > - while (q->rq_qos) { > - struct rq_qos *rqos = q->rq_qos; > - q->rq_qos = rqos->next; > + struct rq_qos *rqos; > + > + spin_lock_irq(&q->queue_lock); > + rqos = q->rq_qos; > + q->rq_qos = NULL; > + spin_unlock_irq(&q->queue_lock); > + > + while (rqos) { > rqos->ops->exit(rqos); > + rqos = rqos->next; > } > } > > 2) rq_qos_add() can still succeed after rq_qos_exit() is done, which > will cause memory leak. Hence a checking is required beforing adding > to q->rq_qos. I'm thinking about flag QUEUE_FLAG_DYING first, but the > flag will not set if disk state is not marked GD_OWNS_QUEUE. Since > blk_unregister_queue() is called before rq_qos_exit(), use the queue > flag QUEUE_FLAG_REGISTERED should be OK. > > For the current problem that device can be removed while initializing > , I'm thinking about some possible solutions: > > Since bfq is initialized in elevator initialization, and others are > in queue initialization, such problem is only possible in iocost, hence > it make sense to fix it in iocost: So, iolatency is likely to switch to similar lazy init scheme, so we better fix it in the rq_qos / core block layer. ... > 3) Or is it better to fix it in the higher level? For example: > add a new restriction that blkcg_deactivate_policy() should be called > with blkcg_activate_policy() in pairs, and blkcg_deactivate_policy() > will wait for blkcg_activate_policy() to finish. Something like: > > diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c > index ef4fef1af909..6266f702157f 100644 > --- a/block/blk-cgroup.c > +++ b/block/blk-cgroup.c > @@ -1410,7 +1410,7 @@ int blkcg_activate_policy(struct request_queue *q, > struct blkcg_gq *blkg, *pinned_blkg = NULL; > int ret; > > - if (blkcg_policy_enabled(q, pol)) > + if (WARN_ON_ONCE(blkcg_policy_enabled(q, pol))) > return 0; > > if (queue_is_mq(q)) > @@ -1477,6 +1477,8 @@ int blkcg_activate_policy(struct request_queue *q, > blkg_put(pinned_blkg); > if (pd_prealloc) > pol->pd_free_fn(pd_prealloc); > + if (!ret) > + wake_up(q->policy_waitq); > return ret; > > enomem: > @@ -1512,7 +1514,7 @@ void blkcg_deactivate_policy(struct request_queue *q, > struct blkcg_gq *blkg; > > if (!blkcg_policy_enabled(q, pol)) > - return; > + wait_event(q->policy_waitq, blkcg_policy_enabled(q, pol)); > wait_event(q->xxx, blkcg_policy_enabled(q, pol)); Yeah, along this line but hopefully something simpler like a mutex. Thanks. -- tejun