This patch restores the behavior of the following algorithm from the legacy
block layer:
- Before completing a request, test-and-set REQ_ATOM_COMPLETE atomically.
  Only call the block driver completion function if that flag was not yet
  set.
- Before calling the block driver timeout function, test-and-set
  REQ_ATOM_COMPLETE atomically. Only call the timeout handler if that flag
  was not yet set. If that flag was already set, do not restart the timer.

Cc: Keith Busch <kbusch@xxxxxxxxxx>
Reported-by: Adrian Hunter <adrian.hunter@xxxxxxxxx>
Fixes: 065990bd198e ("scsi: set timed out out mq requests to complete")
Signed-off-by: Bart Van Assche <bvanassche@xxxxxxx>
---
 drivers/scsi/scsi_error.c | 22 ++++++++--------------
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 9cb0f9df621a..cd05f2db3339 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -331,6 +331,14 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
 	enum blk_eh_timer_return rtn = BLK_EH_DONE;
 	struct Scsi_Host *host = scmd->device->host;
 
+	/*
+	 * scsi_done() may be called concurrently with scsi_times_out(). Only
+	 * one of these two functions should proceed. Hence return early if
+	 * scsi_done() won the race.
+	 */
+	if (test_and_set_bit(SCMD_STATE_COMPLETE, &scmd->state))
+		return BLK_EH_DONE;
+
 	trace_scsi_dispatch_cmd_timeout(scmd);
 	scsi_log_completion(scmd, TIMEOUT_ERROR);
 
@@ -341,20 +349,6 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
 		rtn = host->hostt->eh_timed_out(scmd);
 
 	if (rtn == BLK_EH_DONE) {
-		/*
-		 * Set the command to complete first in order to prevent a real
-		 * completion from releasing the command while error handling
-		 * is using it. If the command was already completed, then the
-		 * lower level driver beat the timeout handler, and it is safe
-		 * to return without escalating error recovery.
-		 *
-		 * If timeout handling lost the race to a real completion, the
-		 * block layer may ignore that due to a fake timeout injection,
-		 * so return RESET_TIMER to allow error handling another shot
-		 * at this command.
-		 */
-		if (test_and_set_bit(SCMD_STATE_COMPLETE, &scmd->state))
-			return BLK_EH_RESET_TIMER;
 		if (scsi_abort_command(scmd) != SUCCESS) {
 			set_host_byte(scmd, DID_TIME_OUT);
 			scsi_eh_scmd_add(scmd);
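
For readers unfamiliar with the pattern, the test-and-set rule described in
the commit message can be illustrated with a minimal user-space sketch. This
is not kernel code: struct demo_cmd and every demo_* function are hypothetical
names invented for this illustration, and C11 atomic_exchange() stands in for
the kernel's test_and_set_bit(). The assumption is only that whichever path
sets the flag first owns the command, and the loser returns without acting
and without restarting the timer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a command; not the kernel's struct scsi_cmnd. */
struct demo_cmd {
	atomic_bool complete;	/* plays the role of SCMD_STATE_COMPLETE */
	int completions;	/* how often the completion path ran */
	int timeouts;		/* how often the timeout path ran */
};

static void demo_cmd_init(struct demo_cmd *cmd)
{
	atomic_init(&cmd->complete, false);
	cmd->completions = 0;
	cmd->timeouts = 0;
}

/* Atomically set the flag; true only for the first caller. */
static bool demo_claim(struct demo_cmd *cmd)
{
	return !atomic_exchange(&cmd->complete, true);
}

/* Completion path: runs the completion work only if it won the race. */
static bool demo_done(struct demo_cmd *cmd)
{
	if (!demo_claim(cmd))
		return false;
	cmd->completions++;
	return true;
}

/* Timeout path: returns early, without rearming, if completion won. */
static bool demo_times_out(struct demo_cmd *cmd)
{
	if (!demo_claim(cmd))
		return false;
	cmd->timeouts++;
	return true;
}

/* Usage: whichever path runs first is the only one that proceeds. */
static void demo_race_example(void)
{
	struct demo_cmd a, b;

	demo_cmd_init(&a);
	assert(demo_done(&a));		/* completion claims the command */
	assert(!demo_times_out(&a));	/* timeout backs off */

	demo_cmd_init(&b);
	assert(demo_times_out(&b));	/* timeout claims the command */
	assert(!demo_done(&b));		/* late completion is ignored */
}
```

The sketch runs the two paths one after the other for clarity. Because the
claim is a single atomic exchange, the same outcome (exactly one winner)
holds when the two paths run on different threads.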