On 2023/09/22 11:14, Bart Van Assche wrote: > On 9/21/23 11:07, Damien Le Moal wrote: >> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c >> index 1d106c8ad5af..d86306d42445 100644 >> --- a/drivers/scsi/sd.c >> +++ b/drivers/scsi/sd.c >> @@ -3727,7 +3727,8 @@ static int sd_remove(struct device *dev) >> >> device_del(&sdkp->disk_dev); >> del_gendisk(sdkp->disk); >> - sd_shutdown(dev); >> + if (sdkp->device->sdev_state == SDEV_RUNNING) >> + sd_shutdown(dev); >> >> put_disk(sdkp->disk); >> return 0; > > Doesn't this patch involve a behavior change? I'm concerned the above patch > will always cause the sd_shutdown() call to be skipped if scsi_remove_host() > is called, whether or not the scsi_remove_host() call is triggered by a > resume failure. I guess the point is: what are the possible states of the scsi device when sd_remove() is called ? From what I have seen so far, it is RUNNING. But I am not 100% sure this true all the time. But given that sd_shutdown() issues commands, I would guess that it makes sense that it is always running. Looking at the code, scsi_remove_host() calls scsi_forget_host() which calls __scsi_remove_device() for any device that is not in the SDEV_DEL state. __scsi_remove_device() then sets the state to SDEV_CANCEL. So it appears that the state should always be CANCEL and not running. However, my tests showed it to be running. I am not fully understanding how sd_remove() end up being called... I will run tests again without this patch for libata suspend/resume. This patch was added because of the hang I saw with the pm80xx driver (using libsas/libata) failing to resume. That resume failure is fixed and libata does not need this patch I think. But I will double check and drop it for now. I think we should investigate this further though, to make sure that we can always safely call sd_shutdown(). __scsi_remove_device() has this comment: /* * If blocked, we go straight to DEL and restart the queue so * any commands issued during driver shutdown (like sync * cache) are errored immediately. */ Which kind of give a hint that we should probably not blindy always try to call sd_shutdown(). -- Damien Le Moal Western Digital Research