Re: Kernel Oops while closing iSCSI connection [transport_free_dev_tasks]

On Sun, 2012-05-06 at 18:31 +0200, Henning Becker wrote:
> Am Samstag, 14. April 2012, 14:37:47 schrieb Nicholas A. Bellinger:
> > On Sat, 2012-04-14 at 18:35 +0200, Henning Becker wrote:
> > > Am Dienstag, 10. April 2012, 23:54:06 schrieb Nicholas A. Bellinger:

<SNIP>

> > Hi Henning,
> > 
> > Ok, I think I've identified the cause of this oops within iscsi-target.
> > 
> > It has to do with the ordering in which your scripts are tearing down
> > the configfs layout.  Looking at the inotify log again I see the
> > following ordering:
> > 
> > # Tear down LUN=0 from TPG=1
> > /sys/kernel/config/target/iscsi/iqn.2012-04.lan.storage:iscsi.storage/tpgt_1/lun/lun_0/ DELETE,ISDIR statistics
> > /sys/kernel/config/target/iscsi/iqn.2012-04.lan.storage:iscsi.storage/tpgt_1/lun/lun_0/ DELETE_SELF
> > /sys/kernel/config/target/iscsi/iqn.2012-04.lan.storage:iscsi.storage/tpgt_1/lun/ DELETE,ISDIR lun_0
> > 
> > # Release IBLOCK backend device
> > /sys/kernel/config/target/core/iblock_0/iscsiLUNTest/ DELETE_SELF
> > /sys/kernel/config/target/core/iblock_0/ DELETE,ISDIR iscsiLUNTest
> > 
> > # Echo '0 > enable' to disable TPG
> > /sys/kernel/config/target/iscsi/iqn.2012-04.lan.storage:iscsi.storage/ CLOSE_NOWRITE,CLOSE,ISDIR
> > /sys/kernel/config/target/iscsi/iqn.2012-04.lan.storage:iscsi.storage/tpgt_1/ MODIFY enable
> > /sys/kernel/config/target/iscsi/iqn.2012-04.lan.storage:iscsi.storage/tpgt_1/ OPEN enable
> > 
> > So it appears with your custom scripts that the LUN=0 + IBLOCK backend
> > is being released *before* explicitly disabling the TPG and forcing all
> > of the active sessions to shutdown.
> > 
> > The oops itself is caused by the removal of the IBLOCK backend, as
> > there is code in the iscsi_cmd descriptor release path that depends upon
> > the backend being in place (although removing the TPG LUN is OK).  This
> > is a genuine bug, and I'll need to think some more about how best to
> > resolve it without adding overhead to the existing data I/O fast
> > path.
> > 
> > That said, the work-around for this bug is to change your custom scripts
> > to follow what rtslib/lio-utils currently does for TPG removal.  That
> > is:
> > 
> > 1: Echo '0 > enable' to disable TPG
> > 2: Tear down NodeACLs+MappedLUNs from TPG
> > 3: Tear down LUN from TPG
> > 4: Tear down entire TPG
> > 5: Release IBLOCK backend device
> 
> Hi Nicholas,
> I have been running my cluster according to your specs for 3 weeks now and
> the problem has not occurred again. :-)
> > 

Hi Henning,

Thanks for confirming that the backend device shutdown ordering is the
root cause trigger for the bug you've seen.  As mentioned, I still need
to think some more about what the proper resolution should actually be
here.

> > I'm quite certain this will avoid the bug in question by forcing
> > shutdown of all active sessions at step #1, instead of doing this part
> > at the end of the sequence as done in your current setup.
> > 
> > Please give it a shot and let me know if you have problems getting your
> > scripts to sync with what the official userspace code is doing here.
> 
> Which official userspace code does that? I'm currently just calling lio_node
> and it did not refuse to release an iblock that is still connected to a
> portal.
> 

What I meant here is that the important part is disabling the TPG
first, before bringing down the TPG LUN associated with the backend that
has active I/O, and before removing the backend itself.  Disabling the
TPG shuts down all active iSCSI sessions (and hence outstanding I/Os) to
the underlying backend devices, and once that has completed it is safe
to remove the associated backend device.
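
For illustration, the five-step TPG removal ordering from the quoted
list above can be written down as a small planner.  This is only a
sketch: teardown_plan() is a hypothetical helper, not part of rtslib or
lio-utils, and the 'acls' directory name and the exact layout under the
TPG are assumptions.  It only returns the ordered steps and touches
nothing; a real script would also have to remove the symlinks and
network portals that live under the TPG.

```python
from pathlib import PurePosixPath

CONFIGFS = PurePosixPath("/sys/kernel/config/target")


def teardown_plan(iqn, tpgt, lun, hba, dev, acls=()):
    """Return ordered (action, path) steps; nothing is executed.

    acls is an iterable of (initiator_name, [mapped_lun_numbers]).
    The directory layout under the TPG is an assumption.
    """
    tpg = CONFIGFS / "iscsi" / iqn / f"tpgt_{tpgt}"
    # 1: disable the TPG first, forcing active sessions to shut down
    steps = [("write 0", str(tpg / "enable"))]
    # 2: tear down NodeACLs + MappedLUNs
    for initiator, mapped_luns in acls:
        acl = tpg / "acls" / initiator
        for mapped in mapped_luns:
            steps.append(("rmdir", str(acl / f"lun_{mapped}")))
        steps.append(("rmdir", str(acl)))
    # 3: tear down the LUN from the TPG
    steps.append(("rmdir", str(tpg / "lun" / f"lun_{lun}")))
    # 4: tear down the entire TPG
    steps.append(("rmdir", str(tpg)))
    # 5: release the backend device last
    steps.append(("rmdir", str(CONFIGFS / "core" / hba / dev)))
    return steps


if __name__ == "__main__":
    for action, path in teardown_plan(
        "iqn.2012-04.lan.storage:iscsi.storage", 1, 0,
        "iblock_0", "iscsiLUNTest",
    ):
        print(action, path)
```

The point of keeping it a pure function is that the ordering can be
reviewed or diffed against what a custom script does before anything is
run against configfs.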

So the main issue is still the final release of the backend device
(after it has been released from the TPG LUN): we need to ensure that
any outstanding I/O still referencing se_device memory is allowed to
complete before 'rmdir /sys/kernel/config/target/core/$HBA/$DEV'
releases se_device.
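
Until that is resolved, one way to check whether a script still hits
the bad ordering is to scan an inotify log like the one quoted earlier.
The sketch below is an illustration under assumptions:
backend_removed_before_disable() is a hypothetical helper, each event
is assumed to sit on one line as 'path EVENTS [name]', paths are
assumed to contain no whitespace, and any MODIFY of 'enable' is treated
as the disable, since the log does not show the value written.

```python
def backend_removed_before_disable(log_lines, tpg_dir, backend_dir):
    """Return True if backend_dir saw DELETE_SELF before tpg_dir/enable
    was modified (or if enable was never modified at all)."""
    removed_at = None
    disabled_at = None
    for index, line in enumerate(log_lines):
        parts = line.split()
        if len(parts) < 2:
            continue  # blank line or comment without an event field
        path, events = parts[0], parts[1].split(",")
        name = parts[2] if len(parts) > 2 else ""
        if removed_at is None and path == backend_dir and "DELETE_SELF" in events:
            removed_at = index
        if (disabled_at is None and path == tpg_dir
                and "MODIFY" in events and name == "enable"):
            disabled_at = index
    if removed_at is None:
        return False  # backend was never removed in this log
    return disabled_at is None or removed_at < disabled_at


if __name__ == "__main__":
    tpg = ("/sys/kernel/config/target/iscsi/"
           "iqn.2012-04.lan.storage:iscsi.storage/tpgt_1/")
    backend = "/sys/kernel/config/target/core/iblock_0/iscsiLUNTest/"
    log = [
        backend + " DELETE_SELF",
        tpg + " MODIFY enable",
    ]
    print(backend_removed_before_disable(log, tpg, backend))
```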

--nab
