On 4/12/23 18:09, Ming Lei wrote:
On Wed, Apr 12, 2023 at 01:41:23PM -0700, Bart Van Assche wrote:
System shutdown happens as follows (see e.g. the systemd source file
src/shutdown/shutdown.c):
* sync() is called.
* reboot(RB_AUTOBOOT/RB_HALT_SYSTEM/RB_POWER_OFF) is called.
* If the reboot() system call returns, log an error message.
The reboot() system call causes the kernel to call kernel_restart(),
kernel_halt() or kernel_power_off(). Each of these functions calls
device_shutdown(). device_shutdown() calls sd_shutdown().
After sd_shutdown() has been called the .shutdown() callback of the LLD
will be called. This makes it unsafe to submit I/O after sd_shutdown()
has finished.
unsafe often means a bug, can you explain it in details about the 'unsafe'?
In this case that means that a kernel crash could be triggered.
+fail_future_io:
+ q = sdkp->disk->queue;
+ blk_queue_flag_set(QUEUE_FLAG_DYING, q);
+ blk_mq_freeze_queue(q);
+ blk_mq_unfreeze_queue(q);
freeze queue can slow down reboot a lot especially when there are lots of
LUNs/Hosts because each device ->shutdown() is serialized.
I think it can be done by changing sdev state to SDEV_OFFLINE here.
Hi Ming,
Changing the SCSI device state is not sufficient to wait for pending
commands to finish. How about replacing this patch with the patch below?
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 4bb87043e6db..4017b5412ba4 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3699,12 +3699,13 @@ static int sd_start_stop_device(struct scsi_disk
*sdkp, int start)
static void sd_shutdown(struct device *dev)
{
struct scsi_disk *sdkp = dev_get_drvdata(dev);
+ struct request_queue *q;
if (!sdkp)
return; /* this can happen */
if (pm_runtime_suspended(dev))
- return;
+ goto fail_future_io;
if (sdkp->WCE && sdkp->media_present) {
sd_printk(KERN_NOTICE, sdkp, "Synchronizing SCSI cache\n");
@@ -3715,6 +3716,14 @@ static void sd_shutdown(struct device *dev)
sd_printk(KERN_NOTICE, sdkp, "Stopping disk\n");
sd_start_stop_device(sdkp, 0);
}
+
+fail_future_io:
+ q = sdkp->disk->queue;
+ blk_queue_flag_set(QUEUE_FLAG_DYING, q);
+ if (!scsi_host_busy(sdkp->device->host))
+ return;
+ blk_mq_freeze_queue(q);
+ blk_mq_unfreeze_queue(q);
}
static int sd_suspend_common(struct device *dev, bool