Re: [PATCH-v4 0/5] Fix LUN_RESET active I/O + TMR handling

On Wed, 2016-02-10 at 22:53 -0800, Nicholas A. Bellinger wrote:
> On Tue, 2016-02-09 at 18:03 +0000, Himanshu Madhani wrote:
> > On 2/8/16, 9:25 PM, "Nicholas A. Bellinger" <nab@xxxxxxxxxxxxxxx> wrote:
> > >On Mon, 2016-02-08 at 23:27 +0000, Himanshu Madhani wrote:
> > >> 
> > >> I am testing this series with a 4.5.0-rc2+ kernel, and I am seeing an
> > >> issue where triggering sg_reset with the host/device/bus options in a
> > >> loop at a 120-second interval causes a call stack. At this point,
> > >> removing the configuration hangs indefinitely. See the attached dmesg
> > >> output from my setup.
> > >> 
> > >
> > >Thanks a lot for testing this.
> > >
> > >So it looks like we're still hitting an indefinite schedule() on
> > >se_cmd->cmd_wait_comp once tcm_qla2xxx session disconnect/reconnect
> > >occurs, after repeated explicit active I/O remote-port sg_resets.
> > >
> > >Does this trigger on the first tcm_qla2xxx session reconnect after
> > >explicit remote-port sg_reset..?  Are session reconnects actively being
> > >triggered during the test..?
> > >
> > >To verify the latter for iscsi-target, I've been using a small patch to
> > >trigger session reset from TMR kthread context in order to simulate the
> > >I_T disconnects.  Something like that would be useful for verifying with
> > >tcm_qla2xxx too.
> > >
> > >That said, I'll be reproducing with tcm_qla2xxx ports this week, and
> > >will enable various debug in a WIP branch for testing.
> 
> Following up here..
> 
> So far using my test setup with ISP2532 ports in P2P + RAMDISK_MCP and
> v4.5-rc1, repeated remote-port active I/O LUN_RESET (sg_reset -d) has
> been functioning as expected with a blocksize_range=4k-256k + iodepth=32
> fio write-verify style workload.
> 
> No ->cmd_kref -1 OOPsen or qla2xxx initiator generated ABORT_TASKs from
> outstanding target TAS responses, nor fio write-verify failures to
> report after 800x remote-port active I/O LUN_RESETs.
> 
> Next step will be to verify explicit tcm_qla2xxx port + module shutdown
> after 1K test iterations, and then IBLOCK async completions <-> NVMe
> backends with the same case.
> 

After letting this test run overnight up to 7k active I/O remote-port
LUN_RESETs, things are still functioning as expected.
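
For anyone wanting to reproduce something similar, here is a rough sketch
of a reset loop of this kind. It is not the actual test harness. Only
'sg_reset -d', the 4k-256k block size range and iodepth=32 come from this
thread; the fio job name, rw mode, ioengine, direct flag and verify method
are assumptions, and the device path and interval are placeholders.

```python
import subprocess
import time
from typing import Callable, List


def sg_reset_cmd(device: str) -> List[str]:
    """Command line for a device (LUN) reset, i.e. 'sg_reset -d <device>'."""
    return ["sg_reset", "-d", device]


def fio_cmd(device: str) -> List[str]:
    """Command line for a write-verify fio job against the same device."""
    return [
        "fio",
        "--name=lun-reset-verify",  # hypothetical job name
        f"--filename={device}",
        "--bsrange=4k-256k",        # from the thread (blocksize_range)
        "--iodepth=32",             # from the thread
        "--rw=randwrite",           # assumption: access pattern not stated
        "--ioengine=libaio",        # assumption: async engine so iodepth applies
        "--direct=1",               # assumption: bypass the page cache
        "--verify=crc32c",          # assumption: verify method not stated
    ]


def reset_loop(
    device: str,
    iterations: int,
    interval_s: float,
    run: Callable = subprocess.run,
    sleep: Callable = time.sleep,
) -> int:
    """Issue `iterations` device resets, sleeping `interval_s` between them.

    Returns the number of resets whose command exited non-zero.
    """
    failures = 0
    for i in range(iterations):
        result = run(sg_reset_cmd(device))
        if getattr(result, "returncode", 0) != 0:
            failures += 1
        if i + 1 < iterations:  # no need to wait after the last reset
            sleep(interval_s)
    return failures
```

The fio job would be started separately (for example with
subprocess.Popen(fio_cmd(dev))) so that I/O is in flight while
reset_loop(dev, iterations, interval_s) runs on the initiator side.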

Also, /etc/init.d/target stop was able to successfully shut down all
active sessions and unload tcm_qla2xxx after the test run.

So AFAICT, the active I/O remote-port LUN_RESET changes are stable with
tcm_qla2xxx ports, separate from the concurrent session disconnect hung
task you reported earlier.

That said, I'll likely push this series as-is for -rc4, given that Dan
has also been able to verify the non-concurrent session disconnect case
on his setup generating constant ABORT_TASKs, and it's still surviving
both cases for iscsi-target ports.

Please give the debug patch from last night a shot, and see if we can
determine the se_cmd states when you hit the hung task.

Thank you,

-nab

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


