On Fri, Jun 19, 2020 at 08:32:34AM +0200, Hannes Reinecke wrote: > >> > >>fac68d7 is related to the famous "dm-multipath: Accept failed paths for > >>multipath maps" patch (e.g. > >>https://patchwork.kernel.org/patch/3368381/#7193001), which never made > >>it upstream. SUSE kernels have shipped this patch for a long time, but > >>we don't apply it any more in recent kernels. > >> > >>With this patch, The reinstate_path() failure would occur if multipathd > >>had created a table with a "disabled" device (one which would be > >>present in a dm map even though the actual block device didn't exist), > >>and then tried to reinstate such a path. It sounds unlikely, but it > >>might be possible if devices are coming and going in quick succession. > >>In that situation, and with the "accept failed path" patch applied, a > >>reload makes some sense, because reload (unlike reinstate) would not > >>fail (at least not for this reason) and would actually add that just- > >>reinstated path. OTOH, it's not likely that the reload would improve > >>matters, either. After all, multipathd is just trying to reinstate a > >>non-existing path. So, I'm fine with skipping the reload. > >> > It's actually _not_ unlikely, but a direct result of multipathd listening to > uevents. > > If you have a map with four paths, and all four of them are going down, you > end up getting four events. > And multipath will be processing each event _individually_, causing it to > generate a reload sequence like: > > [a b c d] > [b c d] > [c d] > [d] > [] > > Of which only the last is valid, as all the intermediate ones contain > invalid paths. > > And _that_ is the scenario I was referring to when creating the patch. But even still, I'm not in favor of calling ev_add_path() with the code as it currently is. In the case you point out, it will presumably fail when adopt_paths() calls pathinfo(). That will orphan the path, which is good. But if the map got removed (intentionally) instead of the path, ev_add_path() will recreate the map, which is bad. Also, it seems more likely for a path to still exist and be usable while the multipath device is gone (the bad case), than for the path to pass the checker function and then disappear before mutipathd tries to reinstate it immediately afterwards (the good case). Further, the bad result, where a deleted multipath device reappears, is actually a problem for users. Having multipathd take a bit of time and pointless effort to catch up after multiple changes happen at once doesn't effect users noticeably. Since the checkerloop thread spends the vast majority of it's not not checking any specific path, if a path goes away, it is most likely to be caught by the path_offline() function. I we want to do something to proactively deal with a path that has been removed, we should do it there. -Ben > Cheers, > > Hannes > -- > Dr. Hannes Reinecke Teamlead Storage & Networking > hare@xxxxxxx +49 911 74053 688 > SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg > HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer -- dm-devel mailing list dm-devel@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/dm-devel