Re: [PATCH v2 1/4] scsi: sd: Let sd_shutdown() fail future I/O

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 2023-04-18 at 11:37 -0700, Bart Van Assche wrote:
> On 4/18/23 07:36, James Bottomley wrote:
> > On Mon, 2023-04-17 at 16:06 -0700, Bart Van Assche wrote:
> > > System shutdown happens as follows (see e.g. the systemd source
> > > file src/shutdown/shutdown.c):
> > > * sync() is called.
> > > * reboot(RB_AUTOBOOT/RB_HALT_SYSTEM/RB_POWER_OFF) is called.
> > > * If the reboot() system call returns, log an error message.
> > > 
> > > The reboot() system call causes the kernel to call
> > > kernel_restart(), kernel_halt() or kernel_power_off(). Each of
> > > these functions calls device_shutdown(). device_shutdown() calls
> > > sd_shutdown(). After sd_shutdown() has been called the
> > > .shutdown() callback of the LLD will be called. Hence, I/O
> > > submitted after sd_shutdown() will hang or may even cause a
> > > kernel crash.
> > > 
> > > Let sd_shutdown() fail future I/O such that LLD .shutdown()
> > > callbacks can be simplified.
> > 
> > What is the actual reason for this?  What is it you think might be
> > submitting I/O after the system gets into this state?  Current
> > sd_shutdown is constructed on the premise that it's the last thing
> > that ever happens to the device before reboot/power off which is
> > why it flushes the cache if necessary and stops the device if
> > required, but for most standard devices neither is required because
> > we don't expect Linux to go down with pending items in the block
> > queue and for a write through disk cache anything that's completed
> > on the block queue is safely durable on the device.
> 
> Hi James,
> 
> .shutdown() callbacks should quiesce I/O but the sd_shutdown()
> function doesn't do this. I see this as a bug.

Why? They're only called on reboot or shutdown.  In orderly cases, the
queues should have been stopped long ago, so there should be no I/O to
quiesce, and in the disorderly case, you obviously didn't care about
the data anyway, so the job of the shutdown routine is to salvage as
much as it can, which is why we flush the cache and stop the disk if
necessary.

> Regarding your question, I think that sd_check_events() can be called
> while sd_shutdown() is in progress or after sd_shutdown() has
> finished.  sd_check_events() may submit a TEST UNIT READY command.

If that's true, that would argue that the block layer caller should be
aware of the shutdown and not do this.  On the other hand, if the TUR
fails, what is the harm?

> In pci_device_shutdown() one can see that the PCI core clears the bus
> master bit for PCI devices during shutdown. In other words, it is not
> safe to submit I/O or to process completions during invocation of 
> shutdown callbacks. I think that also shows that this patch fixes a
> bug.

I don't think it does: in the orderly case, there should be no I/O
outstanding, so nothing to trigger and in the disorderly case, you have
no expectation of the I/O reaching the device, so what would the actual
bug be that this fixes?

James




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]

  Powered by Linux