On 03/08/2017 10:02 AM, Mike Christie wrote:
On 03/08/2017 03:05 AM, Nicholas A. Bellinger wrote:
Hi MNC,
I know Andy has stepped back from TCMU, but I wanted to get his input
(CC'ed) as the original author.
Sure. I'm in agreement with Mike on what needs to happen, but I can
maybe add some comments from a "historical" TCMU angle.
On Wed, 2017-03-01 at 23:14 -0600, Mike Christie wrote:
This patch removes the tcmu the cmd timeout code. Like with the core
target_core_tmr.c code, the kernel does not know the state of the command
or implemenation details of the device, so it might not be safe to just
return the command to the initiator for a possible retry.
Signed-off-by: Mike Christie <mchristi@xxxxxxxxxx>
Applied for the moment, but I'm curious wrt the implications for this..?
Does this mean all outstanding se_cmd descriptors dispatched into
user-space must always complete outstanding I/Os, otherwise target-core
will block indefinitely waiting for completion a la other in-kernel
backend drivers..?
Yes.
Yeah. originally having this timeout was not correct. The initiator is
responsible for aborting (via a TMF) and retrying commands, and
intermediate layers like TCMU having their own timeouts is a bad idea.
Historically we've always considered a backend device that never
completes outstanding se_cmd descriptors to be a buggy driver.
But existing user-space code that uses TCMU (I assume) is allowed to
restart, and is currently not responsible for completing I/Os after the
restart..? Is that the case..?
I wonder if this change may end up being problematic because of this
assumption by user-space..?
I think we need a module param or per device flag.
There is this kernel timeout code that handles the easy part where we
can get se_cmd descriptors completed.
There is no tcmu-runner restart code to figure out the state of those
commands. For example if we were in the WRITE part of a
COMPARE_AND_WRITE and the WRITE is floating around at the time of the
tcmu-runner crash, then when it star
then we do not timeout internally. tcmu-runner would set this. The
default is 30 like it is currently.
My tcmu-runner plan was:
1. For hung commands, we are going to add callouts to the backend module
for TMFs. target_core_tmr.c would call them during TMFs like ABORT_TASK
to have the backend perform the TMF. target_core_user would then call
into tcmu-runner and that would call into the userspace handler.
2. For tcmu-runner restarts, I am working on the ability to block
backends, so at restart time tcmur-runner can loop over the oustanding
commands and call into handlers to figure out if they are running still.
If they are, I am debating what to do still. We could have tcmu runner
will call the handler callouts used for #1 if needed.
Originally, the shared memory area kind of served as a recoverable
journal of cmds issued to userspace: if userspace died then they'd still
be there when userspace restarted and re-mmap()ed the area, and it could
pick up and continue.
However this stopped being the case when we enabled cmds to be completed
out of order. We may reuse an entry A on the TCMU cmd ring to return
status for cmd B, which means that if userspace chooses that moment to
crash then a restarted userspace will not know it still needs to do A
(as well as probably doing B twice). Userspace must assume the burden of
remembering its state, or refuse to work on cmds in the cmd ring when
restarted. Mike's plan #1 is important to clear out these hung commands,
and then #2, because tcmu-runner has its own internal modularization
("handlers") and maybe tcmu-runner doesn't even know the state of
in-flight cmds, but a handler does. Fun.
Anyway, I don't know if I'm adding any new info here but just adding my 2c.
Regards -- Andy
p.s. just thinking here... it would, with some work, be possible to add
a separate cmd completion ring backwards-compatibly. Haven't thought it
through but just tossing that out there.
--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html