On Sun, 2012-05-06 at 18:31 +0200, Henning Becker wrote:
> On Saturday, 14 April 2012, 14:37:47, Nicholas A. Bellinger wrote:
> > On Sat, 2012-04-14 at 18:35 +0200, Henning Becker wrote:
> > > On Tuesday, 10 April 2012, 23:54:06, Nicholas A. Bellinger wrote:

<SNIP>

> > Hi Henning,
> >
> > Ok, I think I've identified the cause of this oops within iscsi-target.
> >
> > It has to do with the ordering in which your scripts are tearing down
> > the configfs layout. Looking at the inotify log again I see the
> > following ordering:
> >
> > # Tear down LUN=0 from TPG=1
> > /sys/kernel/config/target/iscsi/iqn.2012-04.lan.storage:iscsi.storage/tpgt_1/lun/lun_0/ DELETE,ISDIR statistics
> > /sys/kernel/config/target/iscsi/iqn.2012-04.lan.storage:iscsi.storage/tpgt_1/lun/lun_0/ DELETE_SELF
> > /sys/kernel/config/target/iscsi/iqn.2012-04.lan.storage:iscsi.storage/tpgt_1/lun/ DELETE,ISDIR lun_0
> >
> > # Release IBLOCK backend device
> > /sys/kernel/config/target/core/iblock_0/iscsiLUNTest/ DELETE_SELF
> > /sys/kernel/config/target/core/iblock_0/ DELETE,ISDIR iscsiLUNTest
> >
> > # Echo '0 > enable' to disable TPG
> > /sys/kernel/config/target/iscsi/iqn.2012-04.lan.storage:iscsi.storage/ CLOSE_NOWRITE,CLOSE,ISDIR
> > /sys/kernel/config/target/iscsi/iqn.2012-04.lan.storage:iscsi.storage/tpgt_1/ MODIFY enable
> > /sys/kernel/config/target/iscsi/iqn.2012-04.lan.storage:iscsi.storage/tpgt_1/ OPEN enable
> >
> > So it appears that with your custom scripts the LUN=0 + IBLOCK backend
> > is being released *before* explicitly disabling the TPG and forcing all
> > of the active sessions to shut down.
> >
> > The oops itself is caused by the removal of the IBLOCK backend, as
> > there is code in the iscsi_cmd descriptor release path that depends upon
> > the backend being in place (although removing the TPG LUN is OK).
> > This is a genuine bug, and I'll need to think some more about how best
> > to resolve it without adding extra overhead to the existing data I/O
> > fast path.
> >
> > That said, the work-around for this bug is to change your custom scripts
> > to follow what rtslib/lio-utils currently does for TPG removal. That
> > is:
> >
> > 1: Echo '0 > enable' to disable TPG
> > 2: Tear down NodeACLs+MappedLUNs from TPG
> > 3: Tear down LUN from TPG
> > 4: Tear down entire TPG
> > 5: Release IBLOCK backend device
>
> Hi Nicholas,
> I've now been running my cluster according to your specs for 3 weeks
> and the problem has not occurred again. :-)
>

Hi Henning,

Thanks for confirming that the backend device shutdown ordering is the
root-cause trigger for the bug you've seen. As mentioned, I still need
to think some more about what the proper resolution should actually be
here.

> > I'm quite certain this will avoid the bug in question by forcing
> > shutdown of all active sessions at step #1, instead of doing this part
> > at the end of the sequence as done in your current setup.
> >
> > Please give it a shot and let me know if you have problems getting your
> > scripts to sync with what the official userspace code is doing here.
>
> Which official userspace code does that? I'm currently just calling
> lio_node, and it did not refuse to release an iblock that is still
> connected to a portal.
>

What I meant is that the important part, for now, is to disable the TPG
before tearing down the TPG LUN that maps the backend with active I/O,
and to do both before removing the backend itself. Disabling the TPG
shuts down all active iSCSI sessions (and hence the outstanding I/Os) to
the underlying backend devices, and once that has completed it is safe
to remove an associated backend device.
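
The ordering described above can be sketched as a dry-run plan. This is
a minimal Python sketch, not what rtslib or lio-utils actually execute:
`teardown_plan` and its parameters are hypothetical names, it touches
nothing on disk, and it assumes the configfs layout seen in the inotify
log earlier in this thread. The NodeACL/MappedLUN step carries no
concrete sub-path because the thread does not show one.

```python
from typing import List, Tuple

CONFIGFS = "/sys/kernel/config/target"  # mount point as seen in the inotify log


def teardown_plan(iqn: str, tpgt: int, lun: int, hba: str, dev: str) -> List[Tuple[str, str]]:
    """Return (step, configfs path) pairs in the order recommended in this thread.

    Hypothetical helper: it only describes the order and changes nothing.
    """
    tpg = f"{CONFIGFS}/iscsi/{iqn}/tpgt_{tpgt}"
    return [
        ("disable TPG (echo 0 > enable)", f"{tpg}/enable"),
        ("tear down NodeACLs + MappedLUNs", tpg),  # exact sub-paths not shown in the thread
        ("tear down LUN from TPG", f"{tpg}/lun/lun_{lun}"),
        ("tear down entire TPG", tpg),
        ("release backend device", f"{CONFIGFS}/core/{hba}/{dev}"),
    ]


if __name__ == "__main__":
    plan = teardown_plan("iqn.2012-04.lan.storage:iscsi.storage", 1, 0,
                         "iblock_0", "iscsiLUNTest")
    for number, (step, path) in enumerate(plan, start=1):
        print(f"{number}: {step}: {path}")
```

The point of keeping the plan as data is that a script can check its own
order (TPG disable first, backend release last) before touching configfs.
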

So the main issue is still the final release of the backend device
(after it has been released from the TPG LUN): we need to ensure that
any outstanding I/O still referencing se_device memory is allowed to
complete before 'rmdir /sys/kernel/config/target/core/$HBA/$DEV'
releases se_device.

--nab
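
The requirement stated above for the final backend release (outstanding
I/O must finish before the device goes away) can be illustrated with a
toy userspace model. This is only a sketch of the idea in Python:
`Backend`, `start_io`, `complete_io`, and `try_release` are hypothetical
names, and it is neither the kernel code nor the eventual fix, which is
still undecided in this thread.

```python
class Backend:
    """Toy stand-in for a backend device; not the kernel's se_device."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.inflight = 0       # I/Os that still reference this backend
        self.released = False

    def start_io(self) -> None:
        if self.released:
            raise RuntimeError("I/O submitted to a released backend")
        self.inflight += 1

    def complete_io(self) -> None:
        if self.inflight == 0:
            raise RuntimeError("completion without a matching start_io")
        self.inflight -= 1

    def try_release(self) -> bool:
        """Release only once no I/O references remain; report whether it happened."""
        if self.inflight > 0:
            return False
        self.released = True
        return True
```

In this model a release attempted while an I/O is in flight is refused,
and the same attempt succeeds once that I/O has completed.
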