Re: [PATCH -next v2 8/9] block: fix null-pointer dereference in ioc_pd_init

Tejun Heo <tj@xxxxxxxxxx> · Mon, 12 Dec 2022 13:10:26 -1000

Hello,

On Mon, Dec 05, 2022 at 05:32:17PM +0800, Yu Kuai wrote:
> 1) queue_lock is held to protect rq_qos_add() and rq_qos_del(), whlie
> it's not held to protect rq_qos_exit(), which is absolutely not safe
> because they can be called concurrently by configuring iocost and
> removing device.
> I'm thinking about holding the lock to fetch the list and reset
> q->rq_qos first:
> 
> diff --git a/block/blk-rq-qos.c b/block/blk-rq-qos.c
> index 88f0fe7dcf54..271ad65eebd9 100644
> --- a/block/blk-rq-qos.c
> +++ b/block/blk-rq-qos.c
> @@ -288,9 +288,15 @@ void rq_qos_wait(struct rq_wait *rqw, void
> *private_data,
> 
>  void rq_qos_exit(struct request_queue *q)
>  {
> -       while (q->rq_qos) {
> -               struct rq_qos *rqos = q->rq_qos;
> -               q->rq_qos = rqos->next;
> +       struct rq_qos *rqos;
> +
> +       spin_lock_irq(&q->queue_lock);
> +       rqos = q->rq_qos;
> +       q->rq_qos = NULL;
> +       spin_unlock_irq(&q->queue_lock);
> +
> +       while (rqos) {
>                 rqos->ops->exit(rqos);
> +               rqos = rqos->next;
>         }
>  }
> 
> 2) rq_qos_add() can still succeed after rq_qos_exit() is done, which
> will cause memory leak. Hence a checking is required beforing adding
> to q->rq_qos. I'm thinking about flag QUEUE_FLAG_DYING first, but the
> flag will not set if disk state is not marked GD_OWNS_QUEUE. Since
> blk_unregister_queue() is called before rq_qos_exit(), use the queue
> flag QUEUE_FLAG_REGISTERED should be OK.
> 
> For the current problem that device can be removed while initializing
> , I'm thinking about some possible solutions:
> 
> Since bfq is initialized in elevator initialization, and others are
> in queue initialization, such problem is only possible in iocost, hence
> it make sense to fix it in iocost:

So, iolatency is likely to switch to similar lazy init scheme, so we better
fix it in the rq_qos / core block layer.

...
> 3) Or is it better to fix it in the higher level? For example:
> add a new restriction that blkcg_deactivate_policy() should be called
> with blkcg_activate_policy() in pairs, and blkcg_deactivate_policy()
> will wait for blkcg_activate_policy() to finish. Something like:
> 
> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
> index ef4fef1af909..6266f702157f 100644
> --- a/block/blk-cgroup.c
> +++ b/block/blk-cgroup.c
> @@ -1410,7 +1410,7 @@ int blkcg_activate_policy(struct request_queue *q,
>         struct blkcg_gq *blkg, *pinned_blkg = NULL;
>         int ret;
> 
> -       if (blkcg_policy_enabled(q, pol))
> +       if (WARN_ON_ONCE(blkcg_policy_enabled(q, pol)))
>                 return 0;
> 
>         if (queue_is_mq(q))
> @@ -1477,6 +1477,8 @@ int blkcg_activate_policy(struct request_queue *q,
>                 blkg_put(pinned_blkg);
>         if (pd_prealloc)
>                 pol->pd_free_fn(pd_prealloc);
> +       if (!ret)
> +               wake_up(q->policy_waitq);
>         return ret;
> 
>  enomem:
> @@ -1512,7 +1514,7 @@ void blkcg_deactivate_policy(struct request_queue *q,
>         struct blkcg_gq *blkg;
> 
>         if (!blkcg_policy_enabled(q, pol))
> -               return;
> +               wait_event(q->policy_waitq, blkcg_policy_enabled(q, pol));
>    wait_event(q->xxx, blkcg_policy_enabled(q, pol));

Yeah, along this line but hopefully something simpler like a mutex.

Thanks.

-- 
tejun