Re: [PATCH 00/18] ALUA device handler update, part 1

On 11/23/2015 08:10 AM, Hannes Reinecke wrote:
> On 11/20/2015 11:58 PM, Bart Van Assche wrote:
>> On 11/20/2015 02:52 AM, Hannes Reinecke wrote:
>>> One thing, though: I don't really agree with Bart's objection that
>>> moving to a workqueue would tie up too many resources.
>>> Thing is, I'm not convinced that using a workqueue allocates too
>>> many resources (we're speaking of 460 vs. 240 bytes here).
>>> Also, we have to retry commands for quite some time (see the
>>> infamous NetApp takeover/giveback, which can take minutes).
>>> If we were to handle that without a workqueue we'd have to initiate
>>> the retry from the end_io callback, causing quite a deep stack
>>> recursion, which I'm not really fond of.
>>
>> Hello Hannes,
>>
>> Sorry if I wasn't clear enough in my previous e-mail about this
>> topic, but I'm more concerned about the additional memory needed
>> for thread stacks and thread control data structures than about
>> the additional memory needed for the workqueue. I'd like to see
>> the ALUA device handler implementation scale to thousands of LUNs
>> and target port groups. In case all connections between an
>> initiator and a target port group fail, with a synchronous
>> implementation of STPG we will either need a large number of
>> threads (in case of one thread per STPG command) or the STPG
>> commands will be serialized (if there are fewer threads than
>> target port groups). Neither alternative looks attractive to me.
>>
>> BTW, not all storage arrays need STPG retries. Some arrays are
>> able to process an STPG command quickly (i.e. within a few
>> seconds).
>>
>> A previous discussion about this topic is available e.g. at
>> http://thread.gmane.org/gmane.linux.scsi/105340/focus=105601.
>
> Well, one could argue that the whole point of this patchset is to
> allow you to serialize STPGs :-)
>
> We definitely need to serialize STPGs for the same target port
> group; the current implementation is far too limited to take that
> into account.
>
> But the main problem I'm facing with the current implementation is
> that we cannot handle retries. An RTPG or an STPG might fail, at
> which point we need to re-run RTPG to figure out the current
> status. (We also need to send RTPGs when we receive an "ALUA state
> changed" UA, but that's slightly beside the point.)
> The retry cannot be sent directly, as we're evaluating the status
> from end_io context. So to initiate a retry we need to move it over
> to a workqueue.
>
> Or, at least, that's the solution I'm able to come up with.
> If you have other ideas they'd be most welcome.

Hello Hannes,

I agree that retries have to be handled from workqueue context
instead of end_io context. But in workqueue context we can choose
whether to submit the retry synchronously or asynchronously. Unless
I overlooked something, I don't see why the retry should be
submitted synchronously.
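For what it's worth, here is a minimal userspace sketch of the
deferral pattern under discussion (all names are hypothetical, and a
pthread stands in for the kernel workqueue API): the completion
callback only queues the retry, so the stack depth stays bounded no
matter how many retries occur.

```c
/* Userspace sketch: instead of re-submitting a failed command
 * directly from the completion (end_io) callback -- which would
 * recurse and deepen the stack -- the callback only queues a retry,
 * and a separate worker thread performs the submission. */
#include <pthread.h>
#include <stdlib.h>

struct retry_work {
	int attempt;
	struct retry_work *next;
};

static struct retry_work *pending;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int finished;
static int attempts_made;

static void submit_stpg(int attempt);

/* Analogue of queue_work(): defer the retry instead of recursing. */
static void queue_retry(int attempt)
{
	struct retry_work *w = malloc(sizeof(*w));

	w->attempt = attempt;
	pthread_mutex_lock(&lock);
	w->next = pending;
	pending = w;
	pthread_cond_signal(&cond);
	pthread_mutex_unlock(&lock);
}

/* Analogue of the end_io callback: it never resubmits directly. */
static void end_io(int status, int attempt)
{
	if (status != 0) {
		queue_retry(attempt + 1);
		return;
	}
	pthread_mutex_lock(&lock);
	finished = 1;
	pthread_cond_signal(&cond);
	pthread_mutex_unlock(&lock);
}

/* Fake STPG submission: fails twice, then succeeds. */
static void submit_stpg(int attempt)
{
	attempts_made++;
	end_io(attempt < 3 ? -1 : 0, attempt);
}

/* Analogue of the workqueue thread; each retry starts from a fresh,
 * shallow stack frame here rather than nesting inside end_io. */
static void *worker(void *unused)
{
	pthread_mutex_lock(&lock);
	while (!finished) {
		while (!pending && !finished)
			pthread_cond_wait(&cond, &lock);
		while (pending) {
			struct retry_work *w = pending;

			pending = w->next;
			pthread_mutex_unlock(&lock);
			submit_stpg(w->attempt);
			free(w);
			pthread_mutex_lock(&lock);
		}
	}
	pthread_mutex_unlock(&lock);
	return unused;
}

/* Run the demo; returns the number of submissions made. */
static int run_retry_demo(void)
{
	pthread_t t;

	pending = NULL;
	finished = 0;
	attempts_made = 0;
	pthread_create(&t, NULL, worker, NULL);
	submit_stpg(1);	/* initial submission from "process" context */
	pthread_join(t, NULL);
	return attempts_made;
}
```

Note that a single worker like this one serializes all retries, which
is exactly the scaling concern above; submitting the retry
asynchronously from the worker, rather than blocking in it, avoids
tying up one thread per outstanding STPG.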

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


