Re: [PATCH 3/3] tcmu: remove cmd timeout code

"Nicholas A. Bellinger" <nab@xxxxxxxxxxxxxxx> · Wed, 08 Mar 2017 20:16:05 -0800

On Wed, 2017-03-08 at 12:02 -0600, Mike Christie wrote:
> On 03/08/2017 03:05 AM, Nicholas A. Bellinger wrote:
> > Hi MNC,
> > 
> > I know Andy has stepped back from TCMU, but I wanted to get his input
> > (CC'ed) as the original author.
> > 
> > On Wed, 2017-03-01 at 23:14 -0600, Mike Christie wrote:
> >> This patch removes the tcmu the cmd timeout code. Like with the core
> >> target_core_tmr.c code, the kernel does not know the state of the command
> >> or implemenation details of the device, so it might not be safe to just
> >> return the command to the initiator for a possible retry.
> >>
> >> Signed-off-by: Mike Christie <mchristi@xxxxxxxxxx>
> > 
> > Applied for the moment, but I'm curious wrt the implications for this..?
> > 
> > Does this mean all outstanding se_cmd descriptors dispatched into
> > user-space must always complete outstanding I/Os, otherwise target-core
> > will block indefinitely waiting for completion a la other in-kernel
> > backend drivers..? 
> 
> Yes.
> 
> > 
> > Historically we've always considered a backend device that never
> > completes outstanding se_cmd descriptors to be a buggy driver.
> > But existing user-space code that uses TCMU (I assume) is allowed to
> > restart, and is currently not responsible for completing I/Os after the
> > restart..?  Is that the case..?
> > 
> > I wonder if this change may end up being problematic because of this
> > assumption by user-space..?
> 
> I think we need a module param or per device flag.
> 
> There is this kernel timeout code that handles the easy part where we
> can get se_cmd descriptors completed.
> 
> There is no tcmu-runner restart code to figure out the state of those
> commands. For example if we were in the WRITE part of a
> COMPARE_AND_WRITE and the WRITE is floating around at the time of the
> tcmu-runner crash, then when it starts up again, we would just allow 2
> COMPARE_AND_WRITEs to be executing at the same time.
> 
> However, I see your point that other daemons could be implementing the
> needed cleanup and be safe.
> 
> How about a modparam that specifies the timeout length. If set to 0,
> then we do not timeout internally. tcmu-runner would set this. The
> default is 30 like it is currently.

Makes sense for existing compatibility.

Even better than modparam, add a TCMU specific configfs attribute via
tcmu_ops->tb_dev_attrib_attrs and expose it per device.

> 
> My tcmu-runner plan was:
> 
> 1. For hung commands, we are going to add callouts to the backend module
> for TMFs. target_core_tmr.c would call them during TMFs like ABORT_TASK
> to have the backend perform the TMF. target_core_user would then call
> into tcmu-runner and that would call into the userspace handler.
> 

Mmmm.  This would mean TMRs would function very different between TCMU
and non TCMU backends..

We've always had the assumption that once a TMR is processed, code
blocks until the backend completes outstanding se_cmd(s) in question
without an explicit cancellation trigger.

One approach I've taken for a out-of-tree make_request_fn() based bio
driver is return bios with -EAGAIN if a bio takes more than say 3
seconds to complete, in order to signal to IBLOCK to propagate up a
SAM_STAT_BUSY or SAM_STAT_QUEUE_FULL.  This ends up working quite well
to keep initiators happy, and virtually eliminates spurious TMRs in host
environments (like ESX for example) with a very low SCSI timeout. 

Of course, this pushes alot of smarts into the driver below the backend,
but in a distributed scale out backend where I/Os delays, et al. are a
fact of life this is difficult to avoid.

So that said, I'd prefer to have TCMU's user-space code return back I/Os
that are expected to take a long time with SAM_STAT_BUSY or
SAM_STAT_QUEUE_FULL (to reduce host queue_depth), instead of adding a
explicit se_cmd cancellation mechanism in target-core into backend
driver code.

I'd be happy to post the target-core changes needed to do this, which
where posted at one point but ended up not getting merged. 

> 2. For tcmu-runner restarts, I am working on the ability to block
> backends, so at restart time tcmur-runner can loop over the oustanding
> commands and call into handlers to figure out if they are running still.
> If they are, I am debating what to do still. We could have tcmu runner
> will call the handler callouts used for #1 if needed.
> 

Yeah, returning back with SAM_STAT_BUSY or SAM_STAT_QUEUE_FULL would be
good idea here too.

--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html