Hello Martin,
On the transport side, one example would be Ethernet Pause; with InfiniBand there is also ECN (Explicit Congestion Notification), and I'm sure there are others for other transport mechanisms.
I don't know how many of these are currently being used in other parts of the stack, but I think that if a modular approach can be created, with a common set of configuration options at the multipathd level, you would do the entire Linux and storage sysadmin ecosystem a huge favour.
I would need to dive into the stacks, or seek help from the maintainers of these code bases, to learn more about what the options are.
Cheers
Erwin
On Tue, 2021-12-07 at 09:11 +0000, Martin Wilck wrote:
> On Tue, 2021-12-07 at 09:19 +1000, Erwin van Londen wrote:
> > Hello Martin, Muneendra.
> >
> > As I kicked this discussion off at the beginning of the year, and seeing that Muneendra and the Broadcom people have come up with the first iteration, I can only applaud the efforts. On behalf of all storage and Linux administrators I would say "Thank you".
> >
> > As for your remark, Martin, my view would be to try and create a modular approach where the transport layer drivers can hook in and inform multipathd of any event. The module in multipathd would then decide, based on configured characteristics, what the actions should be (take it offline, suspend for X amount of time, introduce X us delay, etc.). That way, when more transport methods are used, these can be dynamically linked into the configuration without having any impact on other parts of the transport stack. I can imagine that InfiniBand, Ethernet, SAS and others utilise different transport characteristics and as such may need to inform the attached hosts of one or more events. On FC this is FPIN, but a similar module may be written for other transports.
>
> Interesting idea. Are you aware of a technology for non-FC transports that could take the role of FPIN? I have to admit I'm not, but that doesn't mean they don't exist or won't exist in the future.
>
> In the first place we'd need to "hook in" an event listener. Like with Muneendra's patch, we're adding a new class of events that we're listening to. The events would then be collected and processed by a separate worker thread (which, unlike the listener, would take the multipath lock), setting path states to marginal or back to normal.
>
> I don't think we want to add plug-ins that spawn their own independent threads, though. That sounds very difficult to handle properly, and we already have more than enough complexity.
>
> If we want to modularize this, we need a *generic* event listener thread. A module would basically provide an fd for that thread to poll on, and a callback to be called when an event occurs. This idea appeals to me a lot, in particular because we already have an event listener (the uevent listener thread) which is sitting idle most of the time. So Muneendra, instead of creating a new receiver thread, you would extend the existing uevent listener to handle the FPIN events as well. The thread would now add uevents to the uevent list and FPIN events to the FPIN events list.
>
> Next, we'd also need a generic event consumer, with callbacks for different types of marginal state handlers. Perhaps this could even be the uevent trigger thread? The uevent trigger has more work to do than the uevent listener. But any handler thread that wants to modify path state would need to take the lock anyway, effectively serializing all operations. So I guess we might as well use both uevent threads for "transport event notification" reception and processing, respectively.
>
> We also need to think about whether the currently existing marginal path handler could fit into this framework. Not so well probably, because it's not event driven and hooks into check_path(). OTOH, maybe possible future mechanisms might hook into check_path(), too, so we'd need a generic callback there?
>
> Moreover, the existing marginal paths handler has two different modes of operation, the "classical" one that disables reinstate, and the more modern one that uses marginal pathgroups. I am wondering whether we need the first mode in the long run. In particular if we want to generalize this feature, we may want to get rid of the "classical" mode altogether. I'm not aware of any distinct advantages of that algorithm compared to marginal path groups.
>
> @Ben, Muneendra, what do you think?
>
> One word of caution here: we must be careful not to over-engineer. As long as no other mechanism like FPIN for other transports is conceivable, generalizing the concept makes only so much sense. Therefore we shouldn't hold back the FPIN patches until we have conceived of a generic mechanism, which may take a lot of time to develop. If another mechanism becomes available, we could still try to generalize the concept, if we keep the current additions clean and well-separated from the core multipathd code.
>
> However I am really thrilled by the prospect of generalizing event handling and reusing the uevent threads for FPIN. That would reduce complexity a lot, which is a good thing IMO.
>
> @Ben, Muneendra, again, your opinions?
>
> Best
> Martin
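
For illustration only, the "generic event listener" idea from the quoted mail (a module hands over an fd to poll on plus a callback) could be sketched roughly as below. This is not multipathd code: every name here (event_listener, listener_add, listener_poll_once, count_event, demo_events_seen) is hypothetical, and a plain pipe stands in for a real transport event source such as an FPIN socket. Locking and the separate consumer thread are deliberately left out.

```c
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_SOURCES 8

/* Hypothetical callback type: called when a module's fd becomes readable. */
typedef void (*event_cb)(int fd, void *ctx);

struct event_source {
	int fd;       /* fd supplied by the module */
	event_cb cb;  /* handler supplied by the module */
	void *ctx;    /* opaque per-module state */
};

struct event_listener {
	struct event_source src[MAX_SOURCES];
	size_t count;
};

/* Register a module's fd and callback; returns 0, or -1 if the table is full. */
static int listener_add(struct event_listener *l, int fd, event_cb cb, void *ctx)
{
	if (l->count >= MAX_SOURCES)
		return -1;
	l->src[l->count++] = (struct event_source){ .fd = fd, .cb = cb, .ctx = ctx };
	return 0;
}

/*
 * One pass of the listener loop: poll every registered fd and call the
 * callback of each readable one. Returns the number of callbacks invoked,
 * or -1 if poll() fails.
 */
static int listener_poll_once(struct event_listener *l, int timeout_ms)
{
	struct pollfd pfd[MAX_SOURCES];
	int dispatched = 0;
	size_t i;

	for (i = 0; i < l->count; i++) {
		pfd[i].fd = l->src[i].fd;
		pfd[i].events = POLLIN;
		pfd[i].revents = 0;
	}
	if (poll(pfd, (nfds_t)l->count, timeout_ms) < 0)
		return -1;
	for (i = 0; i < l->count; i++) {
		if (pfd[i].revents & POLLIN) {
			l->src[i].cb(pfd[i].fd, l->src[i].ctx);
			dispatched++;
		}
	}
	return dispatched;
}

/* Example "module" callback: reads one byte and counts it as one event. */
static void count_event(int fd, void *ctx)
{
	char byte;

	if (read(fd, &byte, 1) == 1)
		(*(int *)ctx)++;
}

/* Demo: write one byte into a pipe and let the listener dispatch it. */
static int demo_events_seen(void)
{
	struct event_listener l = { .count = 0 };
	int fds[2], seen = 0;

	if (pipe(fds) != 0)
		return -1;
	if (listener_add(&l, fds[0], count_event, &seen) != 0 ||
	    write(fds[1], "x", 1) != 1)
		seen = -1;
	else
		listener_poll_once(&l, 1000);
	close(fds[0]);
	close(fds[1]);
	return seen;
}
```

In this sketch the callback only records that an event arrived. Following the design discussed above, a real handler would queue the event for a separate consumer that takes the multipath lock before changing any path state.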
-- dm-devel mailing list dm-devel@xxxxxxxxxx https://listman.redhat.com/mailman/listinfo/dm-devel