On Tue, Nov 13, 2018 at 12:20:46PM -0700, Jens Axboe wrote: > On 11/13/18 11:57 AM, Keith Busch wrote: > > static void scsi_mq_done(struct scsi_cmnd *cmd) > > { > > + if (test_and_set_bit(__SCMD_COMPLETE, &cmd->flags)) > > + return; > > trace_scsi_dispatch_cmd_done(cmd); > > blk_mq_complete_request(cmd->request); > > + > > +#ifdef CONFIG_FAIL_IO_TIMEOUT > > + /* > > + * Clearing complete here serves only to allow the desired recovery to > > + * escalate on blk_rq_should_fake_timeout()'s error injection. > > + */ > > + clear_bit(__SCMD_COMPLETE, &cmd->flags); > > +#endif > > } > > We could have this be: > > static void scsi_mq_done(struct scsi_cmnd *cmd) > { > if (test_and_set_bit(__SCMD_COMPLETE, &cmd->flags)) > return; > trace_scsi_dispatch_cmd_done(cmd); > > if (blk_mq_complete_request(cmd->request)) { > /* > * Clearing complete here serves only to allow the > * desired recovery to escalate on > * blk_rq_should_fake_timeout()'s error injection. > */ > clear_bit(__SCMD_COMPLETE, &cmd->flags); > } > } > > with > > bool blk_mq_complete_request(struct request *rq) > { > if (unlikely(blk_should_fake_timeout(rq->q))) > return true; > __blk_mq_complete_request(rq); > return false; > } > > and not have this CONFIG_FAIL_IO_TIMEOUT dependency, but that'd be a bit > more expensive. I was trying to avoid every cost no matter how negligable (those are the only types of costs left as far as I can see), but I think your proposal might actually be necessary: if a timeout wasn't faked, clearing the completion flag unconditionally might have a problem with a real timeout racing with the real completion. :(