On 4/19/23 04:34, James Bottomley wrote: > On Tue, 2023-04-18 at 11:37 -0700, Bart Van Assche wrote: >> On 4/18/23 07:36, James Bottomley wrote: >>> On Mon, 2023-04-17 at 16:06 -0700, Bart Van Assche wrote: >>>> System shutdown happens as follows (see e.g. the systemd source >>>> file src/shutdown/shutdown.c): >>>> * sync() is called. >>>> * reboot(RB_AUTOBOOT/RB_HALT_SYSTEM/RB_POWER_OFF) is called. >>>> * If the reboot() system call returns, log an error message. >>>> >>>> The reboot() system call causes the kernel to call >>>> kernel_restart(), kernel_halt() or kernel_power_off(). Each of >>>> these functions calls device_shutdown(). device_shutdown() calls >>>> sd_shutdown(). After sd_shutdown() has been called the >>>> .shutdown() callback of the LLD will be called. Hence, I/O >>>> submitted after sd_shutdown() will hang or may even cause a >>>> kernel crash. >>>> >>>> Let sd_shutdown() fail future I/O such that LLD .shutdown() >>>> callbacks can be simplified. >>> >>> What is the actual reason for this? What is it you think might be >>> submitting I/O after the system gets into this state? Current >>> sd_shutdown is constructed on the premise that it's the last thing >>> that ever happens to the device before reboot/power off which is >>> why it flushes the cache if necessary and stops the device if >>> required, but for most standard devices neither is required because >>> we don't expect Linux to go down with pending items in the block >>> queue and for a write through disk cache anything that's completed >>> on the block queue is safely durable on the device. >> >> Hi James, >> >> .shutdown() callbacks should quiesce I/O but the sd_shutdown() >> function doesn't do this. I see this as a bug. > > Why? They're only called on reboot or shutdown. In orderly cases, the > queues should have been stopped long ago, so there should be no I/O to > quiesce, and in the disorderly case, you obviously didn't care about > the data anyway, so the job of the shutdown routine is to salvage as > much as it can, which is why we flush the cache and stop the disk if > necessary. A kernel panic has been reported caused by I/O arriving after driver's shutdown (the driver is fixed now so just scsi exception handling is logged). All drivers should have a protection against late commands queued, with a block after sd.shutdown that could be dropped and so well written drivers could enjoy a bit easier/faster code and and those not well written wouldn't be waiting for issues yet to come. What actually is the downside of blocking further I/O in sd shutdown? > >> Regarding your question, I think that sd_check_events() can be called >> while sd_shutdown() is in progress or after sd_shutdown() has >> finished. sd_check_events() may submit a TEST UNIT READY command. > > If that's true, that would argue that the block layer caller should be > aware of the shutdown and not do this. On the other hand, if the TUR > fails, what is the harm? > >> In pci_device_shutdown() one can see that the PCI core clears the bus >> master bit for PCI devices during shutdown. In other words, it is not >> safe to submit I/O or to process completions during invocation of >> shutdown callbacks. I think that also shows that this patch fixes a >> bug. > > I don't think it does: in the orderly case, there should be no I/O > outstanding, so nothing to trigger and in the disorderly case, you have > no expectation of the I/O reaching the device, so what would the actual > bug be that this fixes? > > James >