Re: [PATCH V10 11/11] block: deactivate hctx when the hctx is actually inactive

Ming Lei <ming.lei@xxxxxxxxxx> · Mon, 11 May 2020 10:11:33 +0800

On Sat, May 09, 2020 at 07:07:55AM -0700, Bart Van Assche wrote:
> On 2020-05-04 19:09, Ming Lei wrote:
> > @@ -1373,28 +1375,16 @@ static void __blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx)
> >  	int srcu_idx;
> >  
> [ ... ]
> >  	if (!cpumask_test_cpu(raw_smp_processor_id(), hctx->cpumask) &&
> > -		cpu_online(hctx->next_cpu)) {
> > -		printk(KERN_WARNING "run queue from wrong CPU %d, hctx %s\n",
> > -			raw_smp_processor_id(),
> > -			cpumask_empty(hctx->cpumask) ? "inactive": "active");
> > -		dump_stack();
> > +	    cpumask_next_and(-1, hctx->cpumask, cpu_online_mask) >=
> > +	    nr_cpu_ids) {
> > +		blk_mq_hctx_deactivate(hctx);
> > +		return;
> >  	}
> 
> The blk_mq_hctx_deactivate() function calls blk_mq_resubmit_rq()
> indirectly. From blk_mq_resubmit_rq():
> 
> +  /* avoid allocation failure by clearing NOWAIT */
> +  nrq = blk_get_request(rq->q, rq->cmd_flags & ~REQ_NOWAIT, flags);
> 
> blk_get_request() calls blk_mq_alloc_request(). blk_mq_alloc_request()
> calls blk_queue_enter(). blk_queue_enter() waits until a queue is
> unfrozen if freezing of a queue has started. As one can see freezing a
> queue triggers a queue run:
> 
> void blk_freeze_queue_start(struct request_queue *q)
> {
> 	mutex_lock(&q->mq_freeze_lock);
> 	if (++q->mq_freeze_depth == 1) {
> 		percpu_ref_kill(&q->q_usage_counter);
> 		mutex_unlock(&q->mq_freeze_lock);
> 		if (queue_is_mq(q))
> 			blk_mq_run_hw_queues(q, false);
> 	} else {
> 		mutex_unlock(&q->mq_freeze_lock);
> 	}
> }
> 
> Does this mean that if queue freezing happens after hot unplugging
> started that a deadlock will occur because the blk_mq_run_hw_queues()
> call in blk_freeze_queue_start() will wait forever?

Yes, it looks like there is such an issue.

However, waiting forever isn't new with this patch: requests that are
still queued (in the scheduler queue or a sw queue) may never be
completed once this hctx becomes inactive.

One simple solution is to pass BLK_MQ_REQ_PREEMPT to the
blk_get_request() call in blk_mq_resubmit_rq(). At that point the freeze
wait cannot return anyway, so it is safe to allocate a new request in
order to complete the old requests that originated from the inactive
hctx.
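
To make the idea concrete, here is a rough userspace model, not kernel
code. Every toy_* name is hypothetical, and the single flag word merges
what the kernel keeps apart (REQ_NOWAIT is part of cmd_flags, while
BLK_MQ_REQ_PREEMPT is an allocation flag). The model only encodes the
behaviour assumed above, namely that a PREEMPT allocation may enter a
queue whose freeze has started. It does not show that blk_queue_enter()
really behaves this way.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for REQ_NOWAIT and BLK_MQ_REQ_PREEMPT. */
#define TOY_REQ_NOWAIT  (1u << 0)
#define TOY_REQ_PREEMPT (1u << 1)

/* Minimal queue state: freeze_depth > 0 means freezing has started. */
struct toy_queue {
	int freeze_depth;
};

enum toy_enter_result {
	TOY_ENTER_OK,          /* allocation may proceed */
	TOY_ENTER_WOULD_BLOCK, /* caller would sleep until unfreeze */
	TOY_ENTER_BUSY,        /* NOWAIT caller fails immediately */
};

/* Toy model of the queue-enter decision, under the stated assumption. */
static enum toy_enter_result toy_queue_enter(const struct toy_queue *q,
					     unsigned int flags)
{
	if (q->freeze_depth == 0)
		return TOY_ENTER_OK;
	if (flags & TOY_REQ_PREEMPT)
		return TOY_ENTER_OK; /* ASSUMPTION: PREEMPT skips the freeze wait */
	if (flags & TOY_REQ_NOWAIT)
		return TOY_ENTER_BUSY;
	return TOY_ENTER_WOULD_BLOCK;
}

/*
 * Toy model of the allocation done when resubmitting a request:
 * NOWAIT is always cleared, and PREEMPT is added when use_preempt is set.
 */
static enum toy_enter_result toy_resubmit_enter(const struct toy_queue *q,
						unsigned int rq_flags,
						bool use_preempt)
{
	unsigned int flags = rq_flags & ~TOY_REQ_NOWAIT;

	if (use_preempt)
		flags |= TOY_REQ_PREEMPT;
	return toy_queue_enter(q, flags);
}
```

In this model, with freeze_depth set to 1, toy_resubmit_enter() without
PREEMPT reports that the caller would block, which is the hang described
above, while the PREEMPT variant is allowed to proceed.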

I will do it this way in V11.

Thanks,
Ming