Re: [PATCH v1] mpt3sas: Fix calltrace observed while running IO & host reset

On Wed, Jun 20, 2018 at 11:43 PM, Bart Van Assche
<Bart.VanAssche@xxxxxxx> wrote:
> On Wed, 2018-06-20 at 09:18 +0530, Chaitra Basappa wrote:
>> We have tried with calling scsi_internal_device_block_nowait() API before
>> doing IOC reset (i.e. host reset) and called
>> scsi_internal_device_unblock_nowait() after performing IOC reset.
>> We have tested this code change with various test cases, such as
>> adding/removing target drives or expanders during diag reset, with and
>> without IOs. At a high level everything works, but we observe the
>> error messages below during hibernation:
>>
>> sd 1:0:0:0: device_block, handle(0x0028)
>> BRCM Debug: sdev->sdev_state: 5 before device_block_nowait
>> BRCM Debug: sdev->sdev_state: 5 after_device_block_nowait
>> sd 1:0:0:0: device_block failed with return(-22) for handle(0x0028)
>> .
>> .
>> sd 0:0:0:0: device_unblock and setting to running, handle(0x0028)
>> sd 0:0:0:0: device_unblock failed with return(-22) for handle(0x0028)
>> performing a block followed by an unblock
>> sd 0:0:0:0: retried device_block failed with return(-22) for handle(0x0028)
>> sd 0:0:0:0: retried device_unblock failed with return(-22) for
>> handle(0x0028)
>>
>> We observe these messages at system resume time, when the
>> driver issues an IOC reset in the .resume() callback function.
>> The error messages above show that the drives are in SDEV_QUIESCE state.
>> Moving a drive from SDEV_QUIESCE to SDEV_BLOCK is not allowed,
>> which is why we see these errors.
>>
>> SDEV_QUIESCE means the device is quiescent: no block commands will be
>> accepted, only specials (which originate in the midlayer).
>
> Neither scsi_internal_device_block_nowait() nor
> scsi_internal_device_unblock_nowait() should ever have been changed from
> static into exported functions. But that's another discussion. Regarding the
> adverse interaction of scsi_internal_device_block_nowait() and
> scsi_internal_device_unblock_nowait() with the power management code, have
> you considered to surround code that blocks and unblocks SCSI devices with
> lock_system_sleep() / unlock_system_sleep() to avoid that these functions
> fail with error code -22?
>

Bart, we tried calling lock_system_sleep() before the IOC reset
operation in the .resume() callback function and unlock_system_sleep()
after the IOC reset. With this code change the system hangs
during hibernation, and we see only the messages below:

[  625.788598] PM: hibernation entry
Jun 21 05:37:33 localhost kernel: PM: hibernation entry
[  627.428159] PM: Syncing filesystems ...
Jun 21 05:37:34 localhost kernel: PM: Syncing filesystems ...
[  628.756119] PM: done.
[  628.758707] Freezing user space processes ... (elapsed 0.001 seconds) done.
[  628.768340] OOM killer disabled.
[  628.772010] PM: Preallocating image memory... done (allocated 197704 pages)
[  632.554470] PM: Allocated 790816 kbytes in 3.77 seconds (209.76 MB/s)
[  632.561664] Freezing remaining freezable tasks ... (elapsed 0.002 seconds) done.
[  632.572269] Suspending console(s) (use no_console_suspend to debug)


The fix we have posted is simple, and we don't see any side
effects from it.
We have done complete regression testing on our fix and found no
issues with it. So please consider the fix we have posted.

Thanks,
Sreekanth


> Thanks,
>
> Bart.
>
>
>


