On 11/20/2015 02:52 AM, Hannes Reinecke wrote:
> One thing, though: I don't really agree with Bart's objection that
> moving to a workqueue would tie up too many resources.
> The thing is, I'm not convinced that using a workqueue allocates
> too many resources (we're talking about 460 vs. 240 bytes here).
> Also, we have to retry commands for quite some time (see the
> infamous NetApp takeover/giveback, which can take minutes).
> If we were to handle that without a workqueue, we'd have to initiate
> the retry from the end_io callback, causing quite deep stack
> recursion, which I'm not really fond of.

Hello Hannes,
Sorry if I wasn't clear enough in my previous e-mail about this topic,
but I'm more concerned about the additional memory needed for thread
stacks and thread control data structures than about the additional
memory needed for the workqueue. I'd like to see the ALUA device handler
implementation scale to thousands of LUNs and target port groups. If
all connections between an initiator and a target port group fail,
with a synchronous implementation of STPG we will either need a large
number of threads (in the case of one thread per STPG command) or the
STPG commands will be serialized (if there are fewer threads than port
groups). Neither alternative looks attractive to me.
BTW, not all storage arrays need STPG retries. Some arrays are able to
process an STPG command quickly (i.e., within a few seconds).
A previous discussion about this topic is available e.g. at
http://thread.gmane.org/gmane.linux.scsi/105340/focus=105601.
Bart.