Hi Benjamin,

On Thu, 2013-07-11 at 18:41 +0100, Benjamin ESTRABAUD wrote:
> Hi!
>
> I've come across a strange kernel panic issue on LIO on Linux 3.6.11
> (b2824f4e0990716407b0c0e7acee75bb6353febf). The issue seems to be linked
> with late IOs calling back to an already deleted target and loops here
> until the kernel panics.
> The test setup was the following:
>
> - Exporting more than one IBLOCK (here two MD RAIDs, but seems to do the
> same with any block devices) over iSCSI.
> - Running intensive IOs (64 ios depth, 548k ios, 100% write, 100%
> random) using IO meter from a single fast host
>
> While the above IOs are running, when "deleting" the iSCSI targets using
> rtslib, and by intermittence, the kernel panicked.

Please provide the specific rtslib calls required in order to trigger
this bug.

A quick dump of your top-level targetcli object tree would also be
helpful for reference.

> The issue is quite intermittent and might happen about 1 out of 20 tests
> or less.
> We have not yet tried to reproduce the issue on Ramdisk or file
> backstores but it seems that this could be linked to the async and slow
> nature of the iblock backstores. In fact, the issue usually happens
> always after 100ms of tearing off the target, which could indicate that
> a BIO comes back and causes the issue.
>

FYI, the backend type should not make any difference here.

> Attached is the kernel trace we get when running this test.
>
> We are going to try to rollback our kernel/update but we were hoping to
> stick on the 3.6.y branch for now.
>
> Have you seen this issue before? Do you know if it has been resolved in
> a later version?

Mmmmm, this smells very much like this v3.6.x specific bug:

target: Fix missing CMD_T_ACTIVE bit regression for pending WRITEs
https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/drivers/target/target_core_transport.c?id=e627c615553a356f6f70215ebb3933c6e057553e

Because the bugfix was merged after v3.6.y stable support was
discontinued, v3.6.11 does *not* contain this fix.

Also, two other patches in the same bugfix series are not present in
v3.6.11 code:

target: Fix use-after-free in LUN RESET handling
https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/drivers/target/target_core_transport.c?id=72b59d6ee8adaa51f70377db0a1917ed489bead8

and

target: Release se_cmd when LUN lookup fails for TMR
https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/drivers/target/target_core_transport.c?id=5a3b6fc0092c5f8dee7820064ee54d2631d48573

So that said, I'd recommend applying these three patches to your local
v3.6.11 tree, or consider moving to >= v3.7.10.

Thanks,

--nab
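
For reference, the ">= v3.7.10" recommendation above can be checked
mechanically. What follows is only a minimal Python sketch and is not
part of the original reply: the helper names parse_release and
meets_recommendation are hypothetical. It compares version numbers only,
so it cannot tell whether a local v3.6.11 tree already carries the three
patches.

```python
import re

# Minimum release recommended in the reply above (v3.7.10).
RECOMMENDED_MIN = (3, 7, 10)


def parse_release(release):
    """Turn a string such as '3.6.11' or '3.7.10-1-amd64' into an int tuple.

    Hypothetical helper: only the leading major.minor[.patch] digits are
    used, and a missing patch level is treated as 0.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return tuple(int(part or 0) for part in match.groups())


def meets_recommendation(release):
    """Return True if the release is at or above v3.7.10."""
    return parse_release(release) >= RECOMMENDED_MIN
```

On a running system the input could come from platform.release() in the
Python standard library, e.g. meets_recommendation(platform.release()).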