On 8/17/24 10:50, Yihang Li wrote:
> If formatting a suspended disk (such as formatting with different DIF
> type), the disk will be resuming first, and then the format command will
> submit to the disk through SG_IO ioctl.
>
> When the disk is processing the format command, the system does not submit
> other commands to the disk. Therefore, the system attempts to suspend the
> disk again and sends the SYNC CACHE command. However, the SYNC CACHE

Why would the system try to suspend the disk with a request in flight ?
Sounds like there is a bug with PM reference counting, no ?

> command will fail because the disk is in the formatting process, which
> will cause the runtime_status of the disk to error and it is difficult
> for user to recover it. Error info like:
>
> [ 669.925325] sd 6:0:6:0: [sdg] Synchronizing SCSI cache
> [ 670.202371] sd 6:0:6:0: [sdg] Synchronize Cache(10) failed: Result: hostbyte=0x00 driverbyte=DRIVER_OK
> [ 670.216300] sd 6:0:6:0: [sdg] Sense Key : 0x2 [current]
> [ 670.221860] sd 6:0:6:0: [sdg] ASC=0x4 ASCQ=0x4
>
> To solve the issue, retry the command until format command is finished.
>
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Yihang Li <liyihang9@xxxxxxxxxx>
> Reviewed-by: Bart Van Assche <bvanassche@xxxxxxx>
> ---
> Changes since v3:
> - Add Cc tag for kernel stable.
>
> Changes since v2:
> - Add Reviewed-by for Bart.
>
> Changes since v1:
> - Updated and added error information to the patch description.
>
> ---
>  drivers/scsi/sd.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index adeaa8ab9951..5cd88a8eea73 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -1823,6 +1823,11 @@ static int sd_sync_cache(struct scsi_disk *sdkp)
>  			    (sshdr.asc == 0x74 && sshdr.ascq == 0x71))	/* drive is password locked */
>  				/* this is no error here */
>  				return 0;
> +
> +			/* retry if format in progress */
> +			if (sshdr.asc == 0x4 && sshdr.ascq == 0x4)
> +				return -EBUSY;
> +
>  			/*
>  			 * This drive doesn't support sync and there's not much
>  			 * we can do because this is called during shutdown

-- 
Damien Le Moal
Western Digital Research
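
For reference, the decision order in the quoted hunk can be modelled in userspace roughly as below. This is only a sketch, not the sd.c code: struct sense_model and classify_sync_cache_sense() are hypothetical names, only the one "no error" condition visible in the hunk is reproduced, and the -EIO fall-through is an assumption made for the sake of the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the asc/ascq fields of the sense header. */
struct sense_model {
	uint8_t asc;
	uint8_t ascq;
};

/*
 * Mirrors the order of checks in the quoted hunk:
 *   0       condition treated as "no error" (drive is password locked)
 *   -EBUSY  format in progress, i.e. the check the patch adds
 *   -EIO    anything else (assumed fall-through, not shown in the hunk)
 */
static int classify_sync_cache_sense(const struct sense_model *s)
{
	assert(s != NULL);

	if (s->asc == 0x74 && s->ascq == 0x71)	/* drive is password locked */
		return 0;

	if (s->asc == 0x04 && s->ascq == 0x04)	/* format in progress */
		return -EBUSY;

	return -EIO;
}
```

With the sense data from the log above (ASC=0x4 ASCQ=0x4), this model returns -EBUSY. Whether the suspend path should be reaching SYNC CACHE at all while the format is outstanding is a separate question that the sketch does not address.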