Re: [PATCH 1/2] scsi: Do not rely on blk-mq for double completions

Jens Axboe <axboe@xxxxxxxxx> · Tue, 13 Nov 2018 12:20:46 -0700

On 11/13/18 11:57 AM, Keith Busch wrote:
> The scsi timeout error handling had been directly updating the request
> state to prevent a natural completion and error handling from completing
> the same request twice. Fix this layering violation by having scsi
> control the fate of its commands with scsi owned flags rather than
> use blk-mq's.
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 61babcb269ab..c680171ca201 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1635,8 +1635,18 @@ static blk_status_t scsi_mq_prep_fn(struct request *req)
>  
>  static void scsi_mq_done(struct scsi_cmnd *cmd)
>  {
> +	if (test_and_set_bit(__SCMD_COMPLETE, &cmd->flags))
> +		return;
>  	trace_scsi_dispatch_cmd_done(cmd);
>  	blk_mq_complete_request(cmd->request);
> +
> +#ifdef CONFIG_FAIL_IO_TIMEOUT
> +	/*
> +	 * Clearing complete here serves only to allow the desired recovery to
> +	 * escalate on blk_rq_should_fake_timeout()'s error injection.
> +	 */
> +	clear_bit(__SCMD_COMPLETE, &cmd->flags);
> +#endif
>  }

We could have this be:

static void scsi_mq_done(struct scsi_cmnd *cmd)
{
	if (test_and_set_bit(__SCMD_COMPLETE, &cmd->flags))
		return;
	trace_scsi_dispatch_cmd_done(cmd);

	if (blk_mq_complete_request(cmd->request)) {
		/*
		 * Clearing complete here serves only to allow the
		 * desired recovery to escalate on
		 * blk_rq_should_fake_timeout()'s error injection.
		 */
		clear_bit(__SCMD_COMPLETE, &cmd->flags);
	}
}

with

bool blk_mq_complete_request(struct request *rq)
{
	if (unlikely(blk_should_fake_timeout(rq->q)))
		return true;
	__blk_mq_complete_request(rq);
	return false;
}

and not have this CONFIG_FAIL_IO_TIMEOUT dependency, but that'd be a bit
more expensive.
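
To make the interaction between the completion bit and the faked timeout
easier to follow, here is a rough user-space model of the second variant.
It is only a sketch: struct fake_cmd, CMD_COMPLETE_BIT, fake_cmd_init(),
fake_complete_request() and fake_cmd_done() are hypothetical stand-ins, and
C11 atomics are assumed in place of the kernel's test_and_set_bit() and
clear_bit().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for __SCMD_COMPLETE; not a kernel definition. */
#define CMD_COMPLETE_BIT 0x1UL

/* Hypothetical stand-in for struct scsi_cmnd plus its request. */
struct fake_cmd {
	atomic_ulong flags;	/* bit 0 models the "complete" flag */
	int completions;	/* how many times completion really ran */
	bool fake_timeout;	/* models the error-injection decision */
};

static void fake_cmd_init(struct fake_cmd *cmd, bool fake_timeout)
{
	atomic_init(&cmd->flags, 0UL);
	cmd->completions = 0;
	cmd->fake_timeout = fake_timeout;
}

/* Atomically set the bit and report whether it was already set. */
static bool cmd_test_and_set_complete(struct fake_cmd *cmd)
{
	return (atomic_fetch_or(&cmd->flags, CMD_COMPLETE_BIT) &
		CMD_COMPLETE_BIT) != 0;
}

/* Returns true when completion was skipped because a timeout is faked. */
static bool fake_complete_request(struct fake_cmd *cmd)
{
	assert(cmd != NULL);
	if (cmd->fake_timeout)
		return true;
	cmd->completions++;
	return false;
}

static void fake_cmd_done(struct fake_cmd *cmd)
{
	/* Second caller loses the race and backs off. */
	if (cmd_test_and_set_complete(cmd))
		return;

	/*
	 * On a faked timeout nothing was completed, so drop the bit
	 * again to let the later (timeout) path claim the command.
	 */
	if (fake_complete_request(cmd))
		atomic_fetch_and(&cmd->flags, ~CMD_COMPLETE_BIT);
}
```

In this model, calling fake_cmd_done() twice on a normal command completes
it once, while a command with fake_timeout set is left uncompleted with the
bit cleared, so a later caller can still claim it.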

Was going to suggest a request flag, but the request is gone at this
point. So that won't really work...

I'm fine with your solution as well, fwiw.

-- 
Jens Axboe