On 11/20/2015 11:58 PM, Bart Van Assche wrote:
> On 11/20/2015 02:52 AM, Hannes Reinecke wrote:
>> One thing, though: I don't really agree with Bart's objection that
>> moving to a workqueue would tie in too many resources.
>> Thing is, I'm not convinced that using a workqueue is allocating
>> too many resources (we're speaking of 460 vs 240 bytes here).
>> Also we have to retry commands for quite some time (cf. the
>> infamous NetApp takeover/giveback, which can take minutes).
>> If we were to handle that without a workqueue we'd have to initiate
>> the retry from the end_io callback, causing quite a deep stack
>> recursion. Which I'm not really fond of.
>
> Hello Hannes,
>
> Sorry if I wasn't clear enough in my previous e-mail about this
> topic, but I'm more concerned about the additional memory needed for
> thread stacks and thread control data structures than about the
> additional memory needed for the workqueue. I'd like to see the ALUA
> device handler implementation scale to thousands of LUNs and target
> port groups. In case all connections between an initiator and a
> target port group fail, with a synchronous implementation of STPG we
> will either need a large number of threads (in case of one thread
> per STPG command) or the STPG commands will be serialized (if there
> are fewer threads than port groups). Neither alternative looks
> attractive to me.
>
> BTW, not all storage arrays need STPG retries. Some arrays are able
> to process an STPG command quickly (i.e. within a few seconds).
>
> A previous discussion about this topic is available e.g. at
> http://thread.gmane.org/gmane.linux.scsi/105340/focus=105601.
>
Well, one could argue that the whole point of this patchset is to
allow you to serialize STPGs :-)
We definitely need to serialize STPGs for the same target port group;
the current implementation is far too limited to take that into
account.

But the main problem I'm facing with the current implementation is
that we cannot handle retries. An RTPG or an STPG might fail, at
which point we need to re-run RTPG to figure out the current status.
(We also need to send RTPGs when we receive an "ALUA state changed"
UA, but that's slightly beside the point.)
The retry cannot be sent directly, as we're evaluating the status
from end_io context. So to initiate a retry we need to move it over
to a workqueue; a rough sketch of what I mean is appended below.

Or, at least, that's the solution I was able to come up with.
If you have other ideas they'd be most welcome.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                   zSeries & Storage
hare@xxxxxxx                          +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
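P.S.: For illustration only, roughly the kind of deferral I have in
mind. All names here (alua_pg_ctx, alua_rtpg_workfn,
alua_handle_failed_stpg, the one-second delay, the use of
system_long_wq) are made up for the example and are not taken from
the actual patch set:

    /*
     * Illustrative sketch: bounce an RTPG retry from the (atomic)
     * completion path to a workqueue, so the re-evaluation runs from
     * process context instead of recursing on the end_io stack.
     */
    #include <linux/kernel.h>
    #include <linux/workqueue.h>
    #include <linux/spinlock.h>
    #include <linux/jiffies.h>

    struct alua_pg_ctx {                    /* hypothetical per-port-group state */
            struct delayed_work rtpg_work;  /* retry runs from process context   */
            spinlock_t lock;
            unsigned int retries;
    };

    /* Runs in process context via the workqueue; may sleep and block. */
    static void alua_rtpg_workfn(struct work_struct *work)
    {
            struct alua_pg_ctx *pg =
                    container_of(to_delayed_work(work),
                                 struct alua_pg_ctx, rtpg_work);

            /* Issue RTPG synchronously here, evaluate the reported
             * state and, if it still looks transitioning, requeue
             * rtpg_work with a delay instead of looping. */
            (void)pg;
    }

    static void alua_pg_ctx_init(struct alua_pg_ctx *pg)
    {
            spin_lock_init(&pg->lock);
            INIT_DELAYED_WORK(&pg->rtpg_work, alua_rtpg_workfn);
            pg->retries = 0;
    }

    /* Called from the completion path: just schedule, don't recurse. */
    static void alua_handle_failed_stpg(struct alua_pg_ctx *pg)
    {
            unsigned long flags;

            spin_lock_irqsave(&pg->lock, flags);
            pg->retries++;
            spin_unlock_irqrestore(&pg->lock, flags);

            /*
             * Retries are naturally serialized per port group: a
             * delayed_work that is already pending is not queued a
             * second time.
             */
            queue_delayed_work(system_long_wq, &pg->rtpg_work, HZ);
    }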