On 11/13/18 12:45 PM, Keith Busch wrote: > On Tue, Nov 13, 2018 at 12:20:46PM -0700, Jens Axboe wrote: >> On 11/13/18 11:57 AM, Keith Busch wrote: >>> static void scsi_mq_done(struct scsi_cmnd *cmd) >>> { >>> + if (test_and_set_bit(__SCMD_COMPLETE, &cmd->flags)) >>> + return; >>> trace_scsi_dispatch_cmd_done(cmd); >>> blk_mq_complete_request(cmd->request); >>> + >>> +#ifdef CONFIG_FAIL_IO_TIMEOUT >>> + /* >>> + * Clearing complete here serves only to allow the desired recovery to >>> + * escalate on blk_rq_should_fake_timeout()'s error injection. >>> + */ >>> + clear_bit(__SCMD_COMPLETE, &cmd->flags); >>> +#endif >>> } >> >> We could have this be: >> >> static void scsi_mq_done(struct scsi_cmnd *cmd) >> { >> if (test_and_set_bit(__SCMD_COMPLETE, &cmd->flags)) >> return; >> trace_scsi_dispatch_cmd_done(cmd); >> >> if (blk_mq_complete_request(cmd->request)) { >> /* >> * Clearing complete here serves only to allow the >> * desired recovery to escalate on >> * blk_rq_should_fake_timeout()'s error injection. >> */ >> clear_bit(__SCMD_COMPLETE, &cmd->flags); >> } >> } >> >> with >> >> bool blk_mq_complete_request(struct request *rq) >> { >> if (unlikely(blk_should_fake_timeout(rq->q))) >> return true; >> __blk_mq_complete_request(rq); >> return false; >> } >> >> and not have this CONFIG_FAIL_IO_TIMEOUT dependency, but that'd be a bit >> more expensive. > > I was trying to avoid every cost no matter how negligable (those are the > only types of costs left as far as I can see), but I think your proposal > might actually be necessary: if a timeout wasn't faked, clearing the > completion flag unconditionally might have a problem with a real timeout > racing with the real completion. :( Yep, I think you are right... No way around it then. Are you going to resend it with the fix? -- Jens Axboe