Re: [RFC PATCH 6/6] multipathd: uxlsnr: handle signals while busy

Martin Wilck <mwilck@xxxxxxxx> · Thu, 17 Jan 2019 11:19:43 +0100

On Wed, 2019-01-16 at 18:04 -0600, Benjamin Marzinski wrote:
> On Fri, Jan 04, 2019 at 06:59:14PM +0100, Martin Wilck wrote:
> > The uxlsnr thread is responsible for receiving and handling
> > signals in multipathd. This works well while it is idle and
> > waiting for connections in ppoll(). But if it's busy, cli commands
> > may need to take the vecs lock, which might take a long time.
> > 
> > Use multipathd's ability to handle pending signals to avoid
> > shutdown delays.
> > 
> > If we find a deadly signal pending, try to use the remaining cycles
> > to provide a meaningful error message to the client rather than
> > simply break the connection.
> > 
> > Signed-off-by: Martin Wilck <mwilck@xxxxxxxx>
> > ---
> >  multipathd/cli.c  | 50 ++++++++++++++++++++++++++++++++++---------
> > ----
> >  multipathd/cli.h  |  2 +-
> >  multipathd/main.c |  4 +++-
> >  3 files changed, 40 insertions(+), 16 deletions(-)
> > 
> > diff --git a/multipathd/cli.c b/multipathd/cli.c
> > index a75afe3f..824c01b6 100644
> > --- a/multipathd/cli.c
> > +++ b/multipathd/cli.c
> > @@ -12,7 +12,7 @@
> >  #include "util.h"
> >  #include "version.h"
> >  #include <readline/readline.h>
> > -
> > +#include "main.h"
> >  #include "cli.h"
> >  
> >  static vector keys;
> > @@ -453,13 +453,45 @@ genhelp_handler (const char *cmd, int error)
> >  	return reply;
> >  }
> >  
> > +static int cli_timedlock(struct mutex_lock *a, long ms)
> > +{
> > +	struct timespec tmo;
> > +	const long delta_ms = 100;
> > +	const long MILLION = 1000L * 1000;
> > +	const long delta_ns = MILLION * delta_ms;
> > +	const long BILLION = 1000L * MILLION;
> > +	int r;
> > +
> > +	if (ms <= 0 || clock_gettime(CLOCK_REALTIME, &tmo) != 0) {
> > +		if (deliver_pending_signals())
> > +			return EINTR;
> 
> Why call pthread_testcancel() before pthread_mutex_lock()?
> 
> I realize that the cancels are no longer blocked by holding the vecs
> lock, but deliver_pending_signals() should have already told us if we
> are about to shutdown. If we're not, it's very unlikely that we will
> be
> cancelled between that check and the pthread_testcancel(). However,
> the
> pthread_mutex_lock() might take some time, so it makes more sense to
> check if we were cancelled after that. Or is there some other reason
> for
> the check being there?

Call it a habit - checking for cancellation before taking a possibly
blocking, long-running path. I call pthread_testcancel() again both
after return from cli_timedlock() and in each loop iteration before
calling pthread_mutex_timedlock(), so the cancellation test after
locking is in place. I agree that being cancelled in that piece of code
is very unlikely, but this is not about likelihood.

Actually, it's kind of pointless to call pthread_testcancel() in code
that's only executed in the context of the uxlsnr thread. Rather, we
should immediately exit the thread when handle_signal() or
deliver_pending_signals() tell us that shutdown is pending. Then
child() wouldn't need to cancel the uxlsnr at all any more.
I postponed that to after this patch set though.

Thanks,
Martin

-- 
Dr. Martin Wilck <mwilck@xxxxxxxx>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel