Re: [PATCH v2 1/4] scsi: sd: Let sd_shutdown() fail future I/O

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 4/19/23 04:34, James Bottomley wrote:
> On Tue, 2023-04-18 at 11:37 -0700, Bart Van Assche wrote:
>> On 4/18/23 07:36, James Bottomley wrote:
>>> On Mon, 2023-04-17 at 16:06 -0700, Bart Van Assche wrote:
>>>> System shutdown happens as follows (see e.g. the systemd source
>>>> file src/shutdown/shutdown.c):
>>>> * sync() is called.
>>>> * reboot(RB_AUTOBOOT/RB_HALT_SYSTEM/RB_POWER_OFF) is called.
>>>> * If the reboot() system call returns, log an error message.
>>>>
>>>> The reboot() system call causes the kernel to call
>>>> kernel_restart(), kernel_halt() or kernel_power_off(). Each of
>>>> these functions calls device_shutdown(). device_shutdown() calls
>>>> sd_shutdown(). After sd_shutdown() has been called the
>>>> .shutdown() callback of the LLD will be called. Hence, I/O
>>>> submitted after sd_shutdown() will hang or may even cause a
>>>> kernel crash.
>>>>
>>>> Let sd_shutdown() fail future I/O such that LLD .shutdown()
>>>> callbacks can be simplified.
>>>
>>> What is the actual reason for this?  What is it you think might be
>>> submitting I/O after the system gets into this state?  Current
>>> sd_shutdown is constructed on the premise that it's the last thing
>>> that ever happens to the device before reboot/power off which is
>>> why it flushes the cache if necessary and stops the device if
>>> required, but for most standard devices neither is required because
>>> we don't expect Linux to go down with pending items in the block
>>> queue and for a write through disk cache anything that's completed
>>> on the block queue is safely durable on the device.
>>
>> Hi James,
>>
>> .shutdown() callbacks should quiesce I/O but the sd_shutdown()
>> function doesn't do this. I see this as a bug.
> 
> Why? They're only called on reboot or shutdown.  In orderly cases, the
> queues should have been stopped long ago, so there should be no I/O to
> quiesce, and in the disorderly case, you obviously didn't care about
> the data anyway, so the job of the shutdown routine is to salvage as
> much as it can, which is why we flush the cache and stop the disk if
> necessary.

A kernel panic has been reported caused by I/O arriving after driver's
shutdown (the driver is fixed now so just scsi exception handling is
logged).
All drivers should have a protection against late commands queued, with
a block after sd.shutdown that could be dropped  and so well written
drivers could enjoy a bit easier/faster code and and those not well
written wouldn't be waiting for issues yet to come.

What actually is the downside of blocking further I/O in sd shutdown?

> 
>> Regarding your question, I think that sd_check_events() can be called
>> while sd_shutdown() is in progress or after sd_shutdown() has
>> finished.  sd_check_events() may submit a TEST UNIT READY command.
> 
> If that's true, that would argue that the block layer caller should be
> aware of the shutdown and not do this.  On the other hand, if the TUR
> fails, what is the harm?
> 
>> In pci_device_shutdown() one can see that the PCI core clears the bus
>> master bit for PCI devices during shutdown. In other words, it is not
>> safe to submit I/O or to process completions during invocation of 
>> shutdown callbacks. I think that also shows that this patch fixes a
>> bug.
> 
> I don't think it does: in the orderly case, there should be no I/O
> outstanding, so nothing to trigger and in the disorderly case, you have
> no expectation of the I/O reaching the device, so what would the actual
> bug be that this fixes?


> 
> James
> 




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]

  Powered by Linux