Re: [PATCH v3 02/17] scsi: core: Fix a race between scsi_done() and scsi_times_out()

On Tue, Nov 30, 2021 at 03:33:09PM -0800, Bart Van Assche wrote:
> This patch restores the behavior of the following algorithm from the legacy
> block layer:
> - Before completing a request, test-and-set REQ_ATOM_COMPLETE atomically.
>   Only call the block driver completion function if that flag was not yet
>   set.
> - Before calling the block driver timeout function, test-and-set
>   REQ_ATOM_COMPLETE atomically. Only call the timeout handler if that flag
>   was not yet set. If that flag was already set, do not restart the timer.
> 
> Cc: Keith Busch <kbusch@xxxxxxxxxx>
> Reported-by: Adrian Hunter <adrian.hunter@xxxxxxxxx>
> Fixes: 065990bd198e ("scsi: set timed out out mq requests to complete")
> Signed-off-by: Bart Van Assche <bvanassche@xxxxxxx>
> ---
>  drivers/scsi/scsi_error.c | 22 ++++++++--------------
>  1 file changed, 8 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 9cb0f9df621a..cd05f2db3339 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -331,6 +331,14 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
>  	enum blk_eh_timer_return rtn = BLK_EH_DONE;
>  	struct Scsi_Host *host = scmd->device->host;
>  
> +	/*
> +	 * scsi_done() may be called concurrently with scsi_times_out(). Only
> +	 * one of these two functions should proceed. Hence return early if
> +	 * scsi_done() won the race.
> +	 */
> +	if (test_and_set_bit(SCMD_STATE_COMPLETE, &scmd->state))
> +		return BLK_EH_DONE;
> +

If the timeout handler successfully sets the state to complete, and
the LLD returns BLK_EH_RESET_TIMER, who gets to complete this command?
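
To make the concern concrete, here is a minimal userspace sketch of that
sequence. It is not kernel code: struct cmd_model, model_done(),
model_times_out() and model_scenario() are hypothetical names, a C11
atomic_flag stands in for SCMD_STATE_COMPLETE, and it assumes the completion
path gates on the same flag, as the comment in the hunk above implies.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of the race; not kernel code. */
enum eh_ret { EH_DONE, EH_RESET_TIMER };

struct cmd_model {
	atomic_flag complete;	/* stands in for SCMD_STATE_COMPLETE */
	int completions;	/* how often the command was really completed */
};

/* Completion path (assumed): only the first caller to set the flag proceeds. */
static bool model_done(struct cmd_model *c)
{
	if (atomic_flag_test_and_set(&c->complete))
		return false;	/* flag already set: this completion is dropped */
	c->completions++;
	return true;
}

/* Patched timeout path: the flag is set before the LLD handler is consulted. */
static enum eh_ret model_times_out(struct cmd_model *c, enum eh_ret lld_answer)
{
	if (atomic_flag_test_and_set(&c->complete))
		return EH_DONE;	/* the completion path won the race */
	/* Nothing clears the flag, even if the LLD asks for a timer reset. */
	return lld_answer;
}

/* The timeout fires first, the LLD asks for more time, then the device completes. */
static int model_scenario(void)
{
	struct cmd_model c = { .complete = ATOMIC_FLAG_INIT, .completions = 0 };
	enum eh_ret ret = model_times_out(&c, EH_RESET_TIMER);

	assert(ret == EH_RESET_TIMER);	/* the timer would be restarted */
	model_done(&c);			/* late completion from the device */
	return c.completions;
}
```

Under these assumptions model_scenario() returns 0: the late completion is
dropped because the flag is still set, so neither path completes the command
until the restarted timer expires again.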

>  	trace_scsi_dispatch_cmd_timeout(scmd);
>  	scsi_log_completion(scmd, TIMEOUT_ERROR);
>  
> @@ -341,20 +349,6 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
>  		rtn = host->hostt->eh_timed_out(scmd);
>  
>  	if (rtn == BLK_EH_DONE) {
> -		/*
> -		 * Set the command to complete first in order to prevent a real
> -		 * completion from releasing the command while error handling
> -		 * is using it. If the command was already completed, then the
> -		 * lower level driver beat the timeout handler, and it is safe
> -		 * to return without escalating error recovery.
> -		 *
> -		 * If timeout handling lost the race to a real completion, the
> -		 * block layer may ignore that due to a fake timeout injection,
> -		 * so return RESET_TIMER to allow error handling another shot
> -		 * at this command.
> -		 */
> -		if (test_and_set_bit(SCMD_STATE_COMPLETE, &scmd->state))
> -			return BLK_EH_RESET_TIMER;
>  		if (scsi_abort_command(scmd) != SUCCESS) {
>  			set_host_byte(scmd, DID_TIME_OUT);
>  			scsi_eh_scmd_add(scmd);


