On 11/13/18 11:57 AM, Keith Busch wrote: > The scsi timeout error handling had been directly updating the request > state to prevent a natural completion and error handling from completing > the same request twice. Fix this layering violation by having scsi > control the fate of its commands with scsi owned flags rather than > use blk-mq's. > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c > index 61babcb269ab..c680171ca201 100644 > --- a/drivers/scsi/scsi_lib.c > +++ b/drivers/scsi/scsi_lib.c > @@ -1635,8 +1635,18 @@ static blk_status_t scsi_mq_prep_fn(struct request *req) > > static void scsi_mq_done(struct scsi_cmnd *cmd) > { > + if (test_and_set_bit(__SCMD_COMPLETE, &cmd->flags)) > + return; > trace_scsi_dispatch_cmd_done(cmd); > blk_mq_complete_request(cmd->request); > + > +#ifdef CONFIG_FAIL_IO_TIMEOUT > + /* > + * Clearing complete here serves only to allow the desired recovery to > + * escalate on blk_rq_should_fake_timeout()'s error injection. > + */ > + clear_bit(__SCMD_COMPLETE, &cmd->flags); > +#endif > } We could have this be: static void scsi_mq_done(struct scsi_cmnd *cmd) { if (test_and_set_bit(__SCMD_COMPLETE, &cmd->flags)) return; trace_scsi_dispatch_cmd_done(cmd); if (blk_mq_complete_request(cmd->request)) { /* * Clearing complete here serves only to allow the * desired recovery to escalate on * blk_rq_should_fake_timeout()'s error injection. */ clear_bit(__SCMD_COMPLETE, &cmd->flags); } } with bool blk_mq_complete_request(struct request *rq) { if (unlikely(blk_should_fake_timeout(rq->q))) return true; __blk_mq_complete_request(rq); return false; } and not have this CONFIG_FAIL_IO_TIMEOUT dependency, but that'd be a bit more expensive. Was going to suggest a request flag, but the request is gone at this point. So that won't really work... I'm with your solution as well, fwiw. -- Jens Axboe