On Fri, 2018-03-23 at 15:00 -0500, Benjamin Marzinski wrote:
> While the rcu code is waiting for a grace period to elapse, no threads
> can register or unregister as rcu reader threads. If for some reason, a
> thread never calls put_multipath_config() to exit a read side critical
> section, then any threads trying to start or stop will hang. This can
> happen if a thread is cancelled between calls to get_multipath_config()
> and put_multipath_config(), and multipathd is reconfigured (which causes
> the rcu code to wait for a grace period).
>
> This patch fixes this issue in two ways. Where possible, it reorders the
> code or saves config values into local variables to remove cancellation
> points between calls to get_multipath_config() and
> put_multipath_config(). In cases where this isn't possible (or where it
> would cause a significant amount of extra work to be done) multipath now
> pushes a cleanup handler to call put_multipath_config().
>
> The only functions that were not modified were ones that were only
> called by multipath or mpathpersist, since these are single threaded
> and already disable rcu thread registration.
>
> Signed-off-by: Benjamin Marzinski <bmarzins@xxxxxxxxxx>

Kudos for doing this meticulous work!

Reviewed-by: Martin Wilck <mwilck@xxxxxxxx>

(I admit my review wasn't in depth. I fully ack the idea of the patch,
and I scanned through it without spotting obvious errors. I did not
check whether you should have changed more code than you already did.)

Here's a suggestion, as I think this is getting pretty ugly (not your
fault). Maybe we should rename get_multipath_config() to
__get_multipath_config() and do something like

#define begin_with_config(conf) \
	__get_multipath_config(conf); \
	pthread_cleanup_push(__put_multipath_config, conf); \
	do

#define end_with_config(conf) \
	while(0); \
	pthread_cleanup_pop(1)

... and require that all code blocks accessing the configuration should
be coded like this:

	begin_with_config(conf) {
		... CODE ...
	} end_with_config(conf);

IMO that'd improve readability and reduce likelihood of errors. As
you're touching so many lines of code anyway, that wouldn't be that much
harder :-/

Regards,
Martin

-- 
Dr. Martin Wilck <mwilck@xxxxxxxx>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel
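
For illustration only, here is a compilable sketch of how the
begin_with_config()/end_with_config() pair from the mail above might
look. It is not taken from the multipath-tools tree: struct config is
reduced to one field, and get_config_locked(), put_config_locked(),
readers_active and read_verbosity() are hypothetical stand-ins for the
real RCU-backed getter and release function. The sketch also assumes the
getter returns the pointer, so the macro assigns it to conf.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for multipath's struct config. */
struct config {
	int verbosity;
};

static struct config demo_config = { .verbosity = 3 };

/* Mock of "inside a read-side critical section"; not the real RCU state. */
static int readers_active;

/* Mock getter: the real one would enter an RCU read-side critical section. */
static struct config *get_config_locked(void)
{
	readers_active++;
	return &demo_config;
}

/* Mock release; the void * signature is what pthread_cleanup_push() expects. */
static void put_config_locked(void *unused)
{
	(void)unused;
	readers_active--;
}

/*
 * Take the config and register the release as a cancellation cleanup
 * handler. The trailing "do" lets the caller supply a braced block.
 */
#define begin_with_config(conf) \
	(conf) = get_config_locked(); \
	pthread_cleanup_push(put_config_locked, (conf)); \
	do

/* Close the caller's block, then pop the handler and run it (argument 1). */
#define end_with_config(conf) \
	while (0); \
	pthread_cleanup_pop(1)

/* Example reader: copies one value out while the config is held. */
int read_verbosity(void)
{
	struct config *conf;
	int v = -1;

	begin_with_config(conf) {
		assert(readers_active == 1);
		v = conf->verbosity;
	} end_with_config(conf);

	return v;
}
```

Because pthread_cleanup_push() and pthread_cleanup_pop() must pair up in
the same lexical scope, code inside such a block must not leave it with
return or goto; the value is copied into a local and returned after
end_with_config() instead.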