On Mon, Oct 21, 2019 at 03:42:57PM -0700, Bart Van Assche wrote: > If blk_poll() is called if no requests are in progress, it may happen that > blk_mq_update_nr_hw_queues() modifies the data structures used by blk_poll(), > e.g. q->queue_hw_ctx[]. Fix this race by serializing blk_poll() against > blk_mq_update_nr_hw_queues(). > > Cc: Christoph Hellwig <hch@xxxxxxxxxxxxx> > Cc: Ming Lei <ming.lei@xxxxxxxxxx> > Cc: Hannes Reinecke <hare@xxxxxxxx> > Cc: Johannes Thumshirn <jthumshirn@xxxxxxx> > Signed-off-by: Bart Van Assche <bvanassche@xxxxxxx> > --- > block/blk-mq.c | 38 +++++++++++++++++++++++++------------- > 1 file changed, 25 insertions(+), 13 deletions(-) > > diff --git a/block/blk-mq.c b/block/blk-mq.c > index 7528678ef41f..ea64d951f411 100644 > --- a/block/blk-mq.c > +++ b/block/blk-mq.c > @@ -3439,19 +3439,7 @@ static bool blk_mq_poll_hybrid(struct request_queue *q, > return blk_mq_poll_hybrid_sleep(q, hctx, rq); > } > > -/** > - * blk_poll - poll for IO completions > - * @q: the queue > - * @cookie: cookie passed back at IO submission time > - * @spin: whether to spin for completions > - * > - * Description: > - * Poll for completions on the passed in queue. Returns number of > - * completed entries found. If @spin is true, then blk_poll will continue > - * looping until at least one completion is found, unless the task is > - * otherwise marked running (or we need to reschedule). > - */ > -int blk_poll(struct request_queue *q, blk_qc_t cookie, bool spin) > +static int __blk_poll(struct request_queue *q, blk_qc_t cookie, bool spin) > { > struct blk_mq_hw_ctx *hctx; > long state; > @@ -3503,6 +3491,30 @@ int blk_poll(struct request_queue *q, blk_qc_t cookie, bool spin) > __set_current_state(TASK_RUNNING); > return 0; > } > + > +/** > + * blk_poll - poll for IO completions > + * @q: the queue > + * @cookie: cookie passed back at IO submission time > + * @spin: whether to spin for completions > + * > + * Description: > + * Poll for completions on the passed in queue. Returns number of > + * completed entries found. If @spin is true, then blk_poll will continue > + * looping until at least one completion is found, unless the task is > + * otherwise marked running (or we need to reschedule). > + */ > +int blk_poll(struct request_queue *q, blk_qc_t cookie, bool spin) > +{ > + int ret; > + > + if (!percpu_ref_tryget(&q->q_usage_counter)) > + return 0; > + ret = __blk_poll(q, cookie, spin); > + blk_queue_exit(q); > + > + return ret; > +} IMO, this change isn't required. Caller of blk_poll is supposed to hold refcount of the request queue, then the related hctx data structure won't go away. When the hctx is in transient state, there can't be IO to be polled, and it is safe to call into IO path. BTW, .poll is absolutely the fast path, we should be careful to add code in this path. Thanks, Ming