Re: [PATCH-v4 0/5] Fix LUN_RESET active I/O + TMR handling

Hi Himanshu & Co,

On Tue, 2016-02-09 at 18:03 +0000, Himanshu Madhani wrote:
> Hi Nic, 
> 
> 
> On 2/8/16, 9:25 PM, "Nicholas A. Bellinger" <nab@xxxxxxxxxxxxxxx> wrote:
> 
> >Hi Himanshu,
> >
> >On Mon, 2016-02-08 at 23:27 +0000, Himanshu Madhani wrote:
> >> 
> >> I am testing this series with the 4.5.0-rc2+ kernel and I am seeing an
> >> issue where triggering sg_reset with the host/device/bus options in a
> >> loop at a 120-second interval causes a call stack. At this point,
> >> removing the configuration hangs indefinitely. See the attached dmesg
> >> output from my setup.
> >> 
> >
> >Thanks a lot for testing this.
> >
> >So it looks like we're still hitting an indefinite schedule() on
> >se_cmd->cmd_wait_comp once tcm_qla2xxx session disconnect/reconnect
> >occurs, after repeated explicit active I/O remote-port sg_resets.
> >
> >Does this trigger on the first tcm_qla2xxx session reconnect after
> >explicit remote-port sg_reset..?  Are session reconnects actively being
> >triggered during the test..?
> >
> >To verify the latter for iscsi-target, I've been using a small patch to
> >trigger session reset from TMR kthread context in order to simulate the
> >I_T disconnects.  Something like that would be useful for verifying with
> >tcm_qla2xxx too.
> >
> >That said, I'll be reproducing with tcm_qla2xxx ports this week, and
> >will enable various debug in a WIP branch for testing.

Following up here..

So far using my test setup with ISP2532 ports in P2P + RAMDISK_MCP and
v4.5-rc1, repeated remote-port active I/O LUN_RESET (sg_reset -d) has
been functioning as expected with a blocksize_range=4k-256k + iodepth=32
fio write-verify style workload.
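
For anyone wanting to approximate this workload, here is a minimal sketch,
not the exact job used above. It builds a fio job file with the block size
range and queue depth mentioned, and issues repeated `sg_reset -d` calls.
The device paths, the `libaio` engine, `rw=write`, `verify=crc32c`, and the
helper names `fio_job` / `reset_loop` are all assumptions made for
illustration.

```python
import subprocess
import time


def fio_job(device, bsrange="4k-256k", iodepth=32):
    """Return the text of a fio job file for a write-then-verify workload.

    Only bsrange and iodepth come from the description above; the other
    options are assumed, commonly used fio settings.
    """
    lines = [
        "[write-verify]",
        f"filename={device}",   # hypothetical block device path
        "ioengine=libaio",      # assumption: async I/O engine
        "direct=1",             # assumption: bypass the page cache
        "rw=write",             # assumption: sequential writes
        f"bsrange={bsrange}",
        f"iodepth={iodepth}",
        "verify=crc32c",        # assumption: checksum used for verification
    ]
    return "\n".join(lines) + "\n"


def reset_loop(sg_device, iterations, interval_s,
               run=subprocess.run, sleep=time.sleep):
    """Issue `sg_reset -d` repeatedly; return how many calls exited with 0."""
    ok = 0
    for _ in range(iterations):
        result = run(["sg_reset", "-d", sg_device])
        if result.returncode == 0:
            ok += 1
        sleep(interval_s)
    return ok
```

Writing `fio_job("/dev/sdX")` to a file and starting fio on it, then calling
`reset_loop("/dev/sgN", 800, interval_s)` from a second shell, would roughly
mirror the test described; the interval is left as a parameter because it is
not stated above.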

No ->cmd_kref -1 OOPsen or qla2xxx initiator generated ABORT_TASKs from
outstanding target TAS responses, nor fio write-verify failures to
report after 800x remote-port active I/O LUN_RESETS.

Next step will be to verify explicit tcm_qla2xxx port + module shutdown
after 1K test iterations, and then IBLOCK async completions <-> NVMe
backends with the same case.

> Let me know if I can help in any way for testing/validating this series.
> 

Thanks.  :)

So based on your original log, it's still unclear whether the session
reset resulting in the se_cmd->cmd_wait_comp indefinite sleep + ->cmd_kref
leak is happening concurrently with repeated remote-port LUN_RESET, or
whether the session reset -> target_wait_for_sess_cmds() occurs after
active I/O has already completed..?  Please confirm.
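
To illustrate why a leaked reference shows up as an indefinite sleep, here
is a toy user-space model, not the target core code. It assumes a waiter
that blocks on a per-command completion which only fires when the last
reference is dropped. `ToyCmd`, `wait_for_cmd` and `run_case` are
hypothetical names, and a timeout stands in for the hang.

```python
import threading


class ToyCmd:
    """Toy stand-in for a command with a reference count and a completion."""

    def __init__(self):
        self._lock = threading.Lock()
        self.refs = 1                        # reference held by the submitter
        self.wait_comp = threading.Event()   # stands in for cmd_wait_comp

    def get(self):
        with self._lock:
            self.refs += 1

    def put(self):
        with self._lock:
            self.refs -= 1
            last = self.refs == 0
        if last:
            self.wait_comp.set()             # last put wakes the waiter


def wait_for_cmd(cmd, timeout):
    """Return True if the command was fully released within `timeout` seconds."""
    return cmd.wait_comp.wait(timeout)


def run_case(leak):
    cmd = ToyCmd()
    cmd.get()        # extra reference, e.g. taken by a reset/abort path
    cmd.put()        # normal completion drops the submitter's reference
    if not leak:
        cmd.put()    # balanced put from the second path
    return wait_for_cmd(cmd, timeout=0.2)
```

With balanced get/put the waiter returns promptly; with one put missing, the
waiter only returns because of the timeout, which is the user-space analogue
of the sleep that never ends.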

To that end, target-pending/debug-for-himanshu has been pushed to enable
extra debug for testing; please update.

Also, you'll want to enable microsecond ring buffer timestamps in your
kernel build too, as it's very useful for this type of debugging.
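
Assuming the option meant here is the standard printk timestamp setting,
the relevant kernel config fragment would be:

```
# Assumption: prefix each ring buffer message with a [seconds.microseconds] timestamp
CONFIG_PRINTK_TIME=y
```

On a kernel built without it, booting with `printk.time=1` should give the
same timestamps in dmesg output.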

Thank you,

--nab
