Re: [PATCH v2 00/36] SCSI target patches for kernel v4.11

On Tue, 2017-02-07 at 11:59 +0100, David Disseldorp wrote:
> On Mon, 6 Feb 2017 19:35:47 +0000, Bart Van Assche wrote:
> 
> > On Mon, 2017-02-06 at 17:44 +0100, David Disseldorp wrote:
> > > FWIW, this was configured using the script at:
> > > https://github.com/ddiss/rapido/blob/master/lio_local_autorun.sh  
> > 
> > Hello David,
> > 
> > Thanks for having provided that script, that's very helpful. I ran that script
> > after I had entered the following:
> > 
> > _fatal() {
> >     exit 1
> > }
> > 
> > DYN_DEBUG_MODULES=
> > DYN_DEBUG_FILES=
> > INITIATOR_IQNS="
> > iqn.2007-10.com.github:sahlberg:libiscsi:iscsi-test
> > iqn.2007-10.com.github:sahlberg:libiscsi:iscsi-test-2
> > "
> > TARGET_IQN=tgt1
> > IP_ADDR1=$(ip addr show dev eth0 | sed -n 's,^[[:blank:]]*inet \([^/]*\)/.*$,\1,p')
> > MAC_ADDR1=
> > IP_ADDR2=
> > MAC_ADDR2=foobar
> > 
> > Next, I ran the two libiscsi tests mentioned earlier:
> > 
> > for ((i=0;i<100;i++)); do
> >   for t in ALL.iSCSITMF.LUNResetSimpleAsync ALL.MultipathIO.Reset; do
> >     iscsi-test-cu --dataloss --allow-sanitize -t $t iscsi://$IP_ADDR1/tgt1/0 iscsi://$IP_ADDR1/tgt1/0
> >   done
> > done
> > 
> > That loop completed in about five seconds. Sorry but that means that I am still
> > unable to reproduce the missing TMF reply that you have reported.
> 
> Aha - If I run the test against a fileio backed LU then it passes, it
> fails against either of the iblock backed LUs.

That is because all FILEIO backend I/O is synchronous, so no se_cmd
descriptors are ever hitting CMD_T_ABORTED for ABORT_TASK or LUN_RESET
in your test.  ;)

> Perhaps this race is
> dependent on the I/O making it to the backstore/block layer by the time
> the LU RESET request comes in? In the past I hit a bug similar to this
> (in the ABORT TASK path), and used the dm-delay device (setup by the
> script) to trip the race.
> 
> Do you see the failure when testing against LUN1 or LUN2?

The fatal flaw with patch #19 is that the new se_cmd->finished completion,
introduced to handle all CMD_T_ABORTED cases, can never make forward
progress: the CMD_T_ABORTED logic takes its own se_cmd->cmd_kref
reference in __target_check_io_state(), and then blocks on
wait_for_completion_timeout(&se_cmd->finished).

In order to complete se_cmd->finished, se_cmd->cmd_kref must reach zero
to call target_release_cmd_kref() -> complete_all(&se_cmd->finished).
But the tmr kthread caller that is blocked on se_cmd->finished holds the
final se_cmd->cmd_kref reference, so the wait fails every time, even in
the simple first order scenario.
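
To make the circular dependency above concrete, here is a small user-space
toy model. It is only a sketch, not kernel code: the SeCmd class, its get()
and put() helpers, and tmr_abort() are hypothetical stand-ins for se_cmd,
the cmd_kref get/put pair, and the TMR path. The drop_own_ref_first=True
variant is there purely for contrast and is not a proposed fix.

```python
import threading


class SeCmd:
    """Toy stand-in for se_cmd: a reference count plus a 'finished' completion."""

    def __init__(self):
        self.kref = 1                       # initial reference held by the fabric driver
        self.finished = threading.Event()   # models se_cmd->finished
        self.lock = threading.Lock()

    def get(self):
        """Take an extra reference (models a kref get on cmd_kref)."""
        with self.lock:
            self.kref += 1

    def put(self):
        """Drop a reference; only the final put signals 'finished'."""
        with self.lock:
            self.kref -= 1
            released = self.kref == 0
        if released:
            self.finished.set()             # models complete_all() on final release
        return released


def tmr_abort(cmd, timeout, drop_own_ref_first):
    """Model the TMR path; return True if 'finished' fired before the timeout."""
    cmd.get()                               # TMR path takes its own reference
    cmd.put()                               # backend I/O completes, fabric drops its reference
    if drop_own_ref_first:
        cmd.put()                           # contrast case: waiter no longer pins the command
    done = cmd.finished.wait(timeout)       # models wait_for_completion_timeout()
    if not drop_own_ref_first:
        cmd.put()                           # final put happens only after the wait gave up
    return done
```

Calling tmr_abort(SeCmd(), 0.05, drop_own_ref_first=False) returns False:
the waiter holds the last reference, so nothing can signal finished until
the wait has already timed out. With drop_own_ref_first=True the same call
returns True, because the final put happens before the wait.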

Patches #19 + #20 also break the second order case, where CMD_T_ABORTED
happens concurrently with se_session shutdown (CMD_T_FABRIC_STOP).
