On Wed, 2019-01-16 at 18:04 -0600, Benjamin Marzinski wrote: > On Fri, Jan 04, 2019 at 06:59:14PM +0100, Martin Wilck wrote: > > The uxlsnr thread is responsible for receiving and handling > > signals in multipathd. This works well while it is idle and > > waiting for connections in ppoll(). But if it's busy, cli commands > > may need to take the vecs lock, which might take a long time. > > > > Use multipathd's ability to handle pending signals to avoid > > shutdown delays. > > > > If we find a deadly signal pending, try to use the remaining cycles > > to provide a meaningful error message to the client rather than > > simply break the connection. > > > > Signed-off-by: Martin Wilck <mwilck@xxxxxxxx> > > --- > > multipathd/cli.c | 50 ++++++++++++++++++++++++++++++++++--------- > > ---- > > multipathd/cli.h | 2 +- > > multipathd/main.c | 4 +++- > > 3 files changed, 40 insertions(+), 16 deletions(-) > > > > diff --git a/multipathd/cli.c b/multipathd/cli.c > > index a75afe3f..824c01b6 100644 > > --- a/multipathd/cli.c > > +++ b/multipathd/cli.c > > @@ -12,7 +12,7 @@ > > #include "util.h" > > #include "version.h" > > #include <readline/readline.h> > > - > > +#include "main.h" > > #include "cli.h" > > > > static vector keys; > > @@ -453,13 +453,45 @@ genhelp_handler (const char *cmd, int error) > > return reply; > > } > > > > +static int cli_timedlock(struct mutex_lock *a, long ms) > > +{ > > + struct timespec tmo; > > + const long delta_ms = 100; > > + const long MILLION = 1000L * 1000; > > + const long delta_ns = MILLION * delta_ms; > > + const long BILLION = 1000L * MILLION; > > + int r; > > + > > + if (ms <= 0 || clock_gettime(CLOCK_REALTIME, &tmo) != 0) { > > + if (deliver_pending_signals()) > > + return EINTR; > > Why call pthread_testcancel() before pthread_mutex_lock()? > > I realize that the cancels are no longer blocked by holding the vecs > lock, but deliver_pending_signals() should have already told us if we > are about to shutdown. If we're not, it's very unlikely that we will > be > cancelled between that check and the pthread_testcancel(). However, > the > pthread_mutex_lock() might take some time, so it makes more sense to > check if we were cancelled after that. Or is there some other reason > for > the check being there? Call it a habit - checking for cancellation before taking a possibly blocking, long-running path. I call pthread_testcancel() again both after return from cli_timedlock() and in each loop iteration before calling pthread_mutex_timedlock(), so the cancellation test after locking is in place. I agree that being cancelled in that piece of code is very unlikely, but this is not about likelihood. Actually, it's kind of pointless to call pthread_testcancel() in code that's only executed in the context of the uxlsnr thread. Rather, we should immediately exit the thread when handle_signal() or deliver_pending_signals() tell us that shutdown is pending. Then child() wouldn't need to cancel the uxlsnr at all any more. I postponed that to after this patch set though. Thanks, Martin -- Dr. Martin Wilck <mwilck@xxxxxxxx>, Tel. +49 (0)911 74053 2107 SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) -- dm-devel mailing list dm-devel@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/dm-devel