On Tue, 2017-02-07 at 08:12 -0800, Nicholas A. Bellinger wrote:
> On Tue, 2017-02-07 at 11:59 +0100, David Disseldorp wrote:
> > On Mon, 6 Feb 2017 19:35:47 +0000, Bart Van Assche wrote:
> >
> > > On Mon, 2017-02-06 at 17:44 +0100, David Disseldorp wrote:
> > > > FWIW, this was configured using the script at:
> > > > https://github.com/ddiss/rapido/blob/master/lio_local_autorun.sh
> > >
> > > Hello David,
> > >
> > > Thanks for having provided that script, that's very helpful. I ran that script
> > > after I had entered the following:
> > >
> > > _fatal() {
> > > 	exit 1
> > > }
> > >
> > > DYN_DEBUG_MODULES=
> > > DYN_DEBUG_FILES=
> > > INITIATOR_IQNS="
> > > iqn.2007-10.com.github:sahlberg:libiscsi:iscsi-test
> > > iqn.2007-10.com.github:sahlberg:libiscsi:iscsi-test-2
> > > "
> > > TARGET_IQN=tgt1
> > > IP_ADDR1=$(ip addr show dev eth0 | sed -n 's,^[[:blank:]]*inet \([^/]*\)/.*$,\1,p')
> > > MAC_ADDR1=
> > > IP_ADDR2=
> > > MAC_ADDR2=foobar
> > >
> > > Next, I ran the two libiscsi tests mentioned earlier:
> > >
> > > for ((i=0;i<100;i++)); do
> > >   for t in ALL.iSCSITMF.LUNResetSimpleAsync ALL.MultipathIO.Reset; do
> > >     iscsi-test-cu --dataloss --allow-sanitize -t $t iscsi://$IP_ADDR1/tgt1/0 iscsi://$IP_ADDR1/tgt1/0
> > >   done
> > > done
> > >
> > > That loop completed in about five seconds. Sorry but that means that I am still
> > > unable to reproduce the missing TMF reply that you have reported.
> >
> > Aha - If I run the test against a fileio backed LU then it passes, it
> > fails against either of the iblock backed LUs.
>
> That is because all FILEIO backend I/O is synchronous, so no se_cmd
> descriptors are ever hitting CMD_T_ABORTED for ABORT_TASK or LUN_RESET
> in your test. ;)
>

Btw, a real simple way to trigger these bugs is to use an IBLOCK backend
that doesn't complete BIOs back to target-core for an extended amount of
time (say 180 seconds).
The delay will result in open-iscsi issuing an ABORT_TASK that blocks
waiting for backend I/O completion (first order issue), followed by
session reinstatement (second order issue).

As-is, this series will create a bunch of hung tasks stuck in
uninterruptible sleep (e.g. 'D' state) forever.

Since you've already verified that v4.10 is working as expected with your
initial TMR LUN_RESET test case, it would be very useful to verify that
the first and second order issues work as expected in mainline with this
type of pathological case, vs. the changes proposed here.
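
One possible way to get such a slow backend, sketched here as an assumption rather than the setup used in this thread, is to stack a device-mapper "delay" target on top of the real block device and export the resulting mapper device through IBLOCK. The helper names (delay_table, create_slow_dev), the mapping name and the device path are hypothetical; blockdev and dmsetup are the standard tools, and creating the mapping needs root.

```shell
#!/bin/sh
# Hypothetical helpers: build a dm-delay mapping whose BIOs are held for
# delay_ms milliseconds before being passed on, so completions reach the
# IBLOCK backend late. Assumes the kernel has the dm-delay target available.

# Print a dm "delay" table line: <start> <length> delay <dev> <offset> <delay_ms>
delay_table() {
    sectors=$1
    dev=$2
    delay_ms=$3
    printf '0 %s delay %s 0 %s\n' "$sectors" "$dev" "$delay_ms"
}

# Create /dev/mapper/<name> on top of <dev> with the given delay (needs root).
create_slow_dev() {
    name=$1
    dev=$2
    delay_ms=$3
    # blockdev --getsz reports the device size in 512-byte sectors.
    sectors=$(blockdev --getsz "$dev") || return 1
    dmsetup create "$name" --table "$(delay_table "$sectors" "$dev" "$delay_ms")"
}
```

For example, `create_slow_dev slow_lun /dev/sdX 180000` would aim for the 180 second case mentioned above, with the IBLOCK backstore then pointed at /dev/mapper/slow_lun instead of the raw device. Whether this reproduces exactly the same ABORT_TASK and session reinstatement sequence is untested here.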