Re: [PATCH V11 11/12] blk-mq: re-submit IO in case that hctx is inactive

Bart Van Assche <bvanassche@xxxxxxx> · Wed, 13 May 2020 08:03:13 -0700

On 2020-05-13 05:21, Christoph Hellwig wrote:
> Use of BLK_MQ_REQ_FORCE is pretty bogus here.
> 
>> +	if (rq->rq_flags & RQF_PREEMPT)
>> +		flags |= BLK_MQ_REQ_PREEMPT;
>> +	if (reserved)
>> +		flags |= BLK_MQ_REQ_RESERVED;
>> +	/*
>> +	 * Queue freezing might be in-progress, and wait freeze can't be
>> +	 * done now because we have request not completed yet, so mark this
>> +	 * allocation as BLK_MQ_REQ_FORCE for avoiding this allocation &
>> +	 * freeze hung forever.
>> +	 */
>> +	flags |= BLK_MQ_REQ_FORCE;
>> +
>> +	/* avoid allocation failure by clearing NOWAIT */
>> +	nrq = blk_get_request(rq->q, rq->cmd_flags & ~REQ_NOWAIT, flags);
>> +	if (!nrq)
>> +		return;
> 
> blk_get_request returns an ERR_PTR.
> 
> But I'd rather avoid the magic new BLK_MQ_REQ_FORCE hack when we can
> just open code it and document what is going on:
> 
> static struct blk_mq_tags *blk_mq_rq_tags(struct request *rq)
> {
> 	struct blk_mq_hw_ctx *hctx = rq->mq_hctx;
> 
> 	if (rq->q->elevator)
> 		return hctx->sched_tags;
> 	return hctx->tags;
> }
> 
> static void blk_mq_resubmit_rq(struct request *rq)
> {
> 	struct blk_mq_alloc_data alloc_data = {
> 		.cmd_flags	= rq->cmd_flags & ~REQ_NOWAIT,
> 	};
> 	struct request *nrq;
> 
> 	if (rq->rq_flags & RQF_PREEMPT)
> 		alloc_data.flags |= BLK_MQ_REQ_PREEMPT;
> 	if (blk_mq_tag_is_reserved(blk_mq_rq_tags(rq), rq->internal_tag))
> 		alloc_data.flags |= BLK_MQ_REQ_RESERVED;
> 
> 	/*
> 	 * We must still be able to finish a resubmission due to a hotplug
> 	 * even if a queue freeze is in progress.
> 	 */
> 	percpu_ref_get(&rq->q->q_usage_counter);
> 	nrq = blk_mq_get_request(rq->q, NULL, &alloc_data);
> 	blk_queue_exit(rq->q);
> 
> 	if (!nrq)
> 		return; // XXX: warn?
> 	if (nrq->q->mq_ops->initialize_rq_fn)
> 		nrq->q->mq_ops->initialize_rq_fn(nrq);
> 
> 	blk_rq_copy_request(nrq, rq);
> 	...

I don't like this because the above code allows requests and tags to be
allocated while a request queue is frozen. I'm concerned that this will
break code that assumes that no tags are allocated while a request queue
is frozen. For example, if a request queue has a single hardware queue
with 64 tags, the above code allocates tag 40, and
blk_mq_tag_update_depth() then reduces the queue depth to 32, will nrq
become a dangling pointer?
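
To make the scenario concrete, here is a toy userspace model of it. This
is only a sketch and not kernel code: every toy_* name is hypothetical,
and it makes no claim about what blk_mq_tag_update_depth() really does
with the backing memory. It only shows how a request obtained under the
old depth can end up with a tag outside the new valid range when nothing
prevents allocation before the resize.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_TAGS 64		/* hypothetical upper bound for this model */

struct toy_request {
	unsigned int tag;
};

struct toy_tags {
	unsigned int depth;			/* currently valid tags: 0..depth-1 */
	bool in_use[TOY_MAX_TAGS];
	struct toy_request rqs[TOY_MAX_TAGS];
};

static void toy_tags_init(struct toy_tags *t, unsigned int depth)
{
	unsigned int i;

	assert(depth <= TOY_MAX_TAGS);
	t->depth = depth;
	for (i = 0; i < TOY_MAX_TAGS; i++) {
		t->in_use[i] = false;
		t->rqs[i].tag = i;
	}
}

/* Claim a specific tag; returns NULL if it is out of range or taken. */
static struct toy_request *toy_get_tag(struct toy_tags *t, unsigned int tag)
{
	if (tag >= t->depth || t->in_use[tag])
		return NULL;
	t->in_use[tag] = true;
	return &t->rqs[tag];
}

/*
 * Change the depth without looking at in_use[]. This models a resize
 * that relies on the caller guaranteeing that no tags are outstanding.
 */
static void toy_update_depth(struct toy_tags *t, unsigned int new_depth)
{
	assert(new_depth <= TOY_MAX_TAGS);
	t->depth = new_depth;
}

/* True if the request's tag is still inside the valid range. */
static bool toy_request_in_range(const struct toy_tags *t,
				 const struct toy_request *rq)
{
	return rq->tag < t->depth;
}
```

In this model, initializing with depth 64, claiming tag 40 and then
calling toy_update_depth() with 32 leaves toy_request_in_range()
returning false for the request that is still held.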

Thanks,

Bart.