On 07.08.20 16:30, Alan Stern wrote: > On Fri, Aug 07, 2020 at 11:51:21AM +0200, Martin Kepplinger wrote: >> it's really strange: below is the change I'm trying. Of course that's >> only for testing the functionality, nothing how a patch could look like. >> >> While I remember it had worked, now (weirdly since I tried that mounting >> via fstab) it doesn't anymore! >> >> What I understand (not much): I handle the error with "retry" via the >> new flag, but scsi_decide_disposition() returns SUCCESS because of "no >> more retries"; but it's the first and only time it's called. > > Are you saying that scmd->allowed is set to 0? Or is scsi_notry_cmd() > returning a nonzero value? Whichever is true, why does it happen that > way? scsi_notry_cmd() is returning 1. (it's retry 1 of 5 allowed). why is it returning 1? REQ_FAILFAST_DEV is set. It's DID_OK, then "if (status_byte(scmd->result) != CHECK_CONDITION)" appearently is not true, then at the end it returns 1 because of REQ_FAILFAST_DEV. that seems to come from the block layer. why and when? could I change that so that the scsi error handling stays in control? > > What is the failing command? Is it a READ(10)? Not sure how I'd answer that, but here's the test to trigger the error: mount /dev/sda1 /mnt cd /mnt ls cp file ~/ (if ls "works" and doesn't yet trigger the error) and that's the (familiar looking) logs when doing so. again: despite the mentioned workaround in scsi_error and the new expected_media_change flag *is* set and gets cleared, as it should be. REQ_FAILFAST_DEV seems to override what I want to do is scsi_error: [ 55.557629] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=0s [ 55.557639] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current] [ 55.557646] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0 [ 55.557657] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 08 fc e0 00 00 01 00 [ 55.557666] blk_update_request: I/O error, dev sda, sector 589024 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0 [ 55.568899] sd 0:0:0:0: [sda] tag#0 device offline or changed [ 55.574691] blk_update_request: I/O error, dev sda, sector 589025 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0 [ 55.585756] sd 0:0:0:0: [sda] tag#0 device offline or changed [ 55.591562] blk_update_request: I/O error, dev sda, sector 589026 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0 [ 55.602274] sd 0:0:0:0: [sda] tag#0 device offline or changed (... goes on with the same) > >> How can this be? What am I missing? > > It's kind of hard to tell without seeing the error messages or system > log or any debugging information. thanks a lot for getting back to me, martin > > Alan Stern > >> --- a/drivers/scsi/scsi_error.c >> +++ b/drivers/scsi/scsi_error.c >> @@ -565,6 +565,13 @@ int scsi_check_sense(struct scsi_cmnd *scmd) >> return NEEDS_RETRY; >> } >> } >> + if (scmd->device->expecting_media_change) { >> + if (sshdr.asc == 0x28 && sshdr.ascq == 0x00) { >> + scmd->device->expecting_media_change = 0; >> + return NEEDS_RETRY; >> + } >> + } >> + >> /* >> * we might also expect a cc/ua if another LUN on the target >> * reported a UA with an ASC/ASCQ of 3F 0E - >> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c >> index d90fefffe31b..bb583e403b81 100644 >> --- a/drivers/scsi/sd.c >> +++ b/drivers/scsi/sd.c >> @@ -3642,6 +3642,8 @@ static int sd_resume(struct device *dev) >> if (!sdkp) /* E.g.: runtime resume at the start of sd_probe() */ >> return 0; >> >> + sdkp->device->expecting_media_change = 1; >> + >> if (!sdkp->device->manage_start_stop) >> return 0; >> >> diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h >> index bc5909033d13..f5fc1af68e00 100644 >> --- a/include/scsi/scsi_device.h >> +++ b/include/scsi/scsi_device.h >> @@ -169,6 +169,7 @@ struct scsi_device { >> * this device */ >> unsigned expecting_cc_ua:1; /* Expecting a CHECK_CONDITION/UNIT_ATTN >> * because we did a bus reset. */ >> + unsigned expecting_media_change:1; >> unsigned use_10_for_rw:1; /* first try 10-byte read / write */ >> unsigned use_10_for_ms:1; /* first try 10-byte mode sense/select */ >> unsigned set_dbd_for_ms:1; /* Set "DBD" field in mode sense */