On 09.08.20 17:26, Alan Stern wrote: > On Sun, Aug 09, 2020 at 11:20:22AM +0200, Martin Kepplinger wrote: >> Hey Alan, I'm really glad for that, I suspected some of this but I have >> little experience in scsi/block layers, so that is super helpful. >> >> I'd appreciate an opinion on the below workaround that *seems* to work >> now (let's see, I had thought so before :) >> >> Whether or not this helps to find a real solution, let's see. But >> integration of such a flag in the error handling paths is what's >> interesting for now: >> >> >> --- a/drivers/scsi/scsi_error.c >> +++ b/drivers/scsi/scsi_error.c >> @@ -565,6 +565,17 @@ int scsi_check_sense(struct scsi_cmnd *scmd) >> return NEEDS_RETRY; >> } >> } >> + if (scmd->device->expecting_media_change) { >> + if (sshdr.asc == 0x28 && sshdr.ascq == 0x00) { >> + /* clear expecting_media_change in >> + * scsi_noretry_cmd() because we need >> + * to override possible "failfast" overrides >> + * that block readahead can cause. >> + */ >> + return NEEDS_RETRY; > > This is a somewhat fragile approach. You don't know for certain that > scsi_noretry_cmd will be called. Also, scsi_noretry_cmd can be called > from other places. > > It would be better to clear the expecting_media_change flag just before > returning from scsi_decide_disposition. That way its use is localized > to one routine, not spread out between two. > > Alan Stern > Hi Alan, maybe you're right. I initially just thought that I'd allow for specific error codes in scsi_noretry_cmd() to return non-NULL (BUS_BUSY, PARITY, ERROR) despite having the flag set. The below version works equally fine for me but I'm not sure if it's actually more safe. James, when exposing a new writable sysfs option like "suspend_no_media_change"(?) that drivers can check before setting the new "expecting_media_change" flag (during resume), would this addition make sense to you? thanks, martin --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -565,6 +565,18 @@ int scsi_check_sense(struct scsi_cmnd *scmd) return NEEDS_RETRY; } } + if (scmd->device->expecting_media_change) { + if (sshdr.asc == 0x28 && sshdr.ascq == 0x00) { + /* + * clear the expecting_media_change in + * scsi_decide_disposition() because we + * need to catch possible "fail fast" overrides + * that block readahead can cause. + */ + return NEEDS_RETRY; + } + } + /* * we might also expect a cc/ua if another LUN on the target * reported a UA with an ASC/ASCQ of 3F 0E - @@ -1944,9 +1956,19 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd) * the request was not marked fast fail. Note that above, * even if the request is marked fast fail, we still requeue * for queue congestion conditions (QUEUE_FULL or BUSY) */ - if ((++scmd->retries) <= scmd->allowed - && !scsi_noretry_cmd(scmd)) { - return NEEDS_RETRY; + if ((++scmd->retries) <= scmd->allowed) { + /* + * but scsi_noretry_cmd() cannot override the + * expecting_media_change flag. + */ + if (!scsi_noretry_cmd(scmd) || + scmd->device->expecting_media_change) { + scmd->device->expecting_media_change = 0; + return NEEDS_RETRY; + } else { + /* marked fast fail and not expected. */ + return SUCCESS; + } } else { /* * no more retries - report this one back to upper level. diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index d90fefffe31b..bb583e403b81 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -3642,6 +3642,8 @@ static int sd_resume(struct device *dev) if (!sdkp) /* E.g.: runtime resume at the start of sd_probe() */ return 0; + sdkp->device->expecting_media_change = 1; + if (!sdkp->device->manage_start_stop) return 0; diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h index bc5909033d13..f5fc1af68e00 100644 --- a/include/scsi/scsi_device.h +++ b/include/scsi/scsi_device.h @@ -169,6 +169,7 @@ struct scsi_device { * this device */ unsigned expecting_cc_ua:1; /* Expecting a CHECK_CONDITION/UNIT_ATTN * because we did a bus reset. */ + unsigned expecting_media_change:1; unsigned use_10_for_rw:1; /* first try 10-byte read / write */ unsigned use_10_for_ms:1; /* first try 10-byte mode sense/select */ unsigned set_dbd_for_ms:1; /* Set "DBD" field in mode sense */ --