On 2018/3/21 0:51, Martin Wilck wrote: > The ppoll() calls of the uxlsnr thread are vital for proper functioning of > multipathd. If the uxlsnr thread can't open the socket or fails to call ppoll() > for other reasons, quit the daemon. If we don't do that, multipathd may > hang in a state where it can't be terminated any more, because the uxlsnr > thread is responsible for handling all signals. This happens e.g. if > systemd's multipathd.socket is running in and multipathd is started from > outside systemd. > > 24f2844 "multipathd: fix signal blocking logic" has made this problem more > severe. Before that patch, the signals weren't actually blocked in any thread. > That's not to say 24f2844 was wrong. I still think it's correct, we just > need this one on top. > > Signed-off-by: Martin Wilck <mwilck@xxxxxxxx> > --- > multipathd/uxlsnr.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/multipathd/uxlsnr.c b/multipathd/uxlsnr.c > index cdafd82943e7..6f666663fc6f 100644 > --- a/multipathd/uxlsnr.c > +++ b/multipathd/uxlsnr.c > @@ -178,7 +178,7 @@ void * uxsock_listen(uxsock_trigger_fn uxsock_trigger, void * trigger_data) > > if (ux_sock == -1) { > condlog(1, "could not create uxsock: %d", errno); > - return NULL; > + exit_daemon(); > } > > pthread_cleanup_push(uxsock_cleanup, (void *)ux_sock); > @@ -187,7 +187,7 @@ void * uxsock_listen(uxsock_trigger_fn uxsock_trigger, void * trigger_data) > polls = (struct pollfd *)MALLOC((MIN_POLLS + 1) * sizeof(struct pollfd)); > if (!polls) { > condlog(0, "uxsock: failed to allocate poll fds"); > - return NULL; > + exit_daemon(); > } > sigfillset(&mask); > sigdelset(&mask, SIGINT); > @@ -249,6 +249,7 @@ void * uxsock_listen(uxsock_trigger_fn uxsock_trigger, void * trigger_data) > > /* something went badly wrong! */ > condlog(0, "uxsock: poll failed with %d", errno); > + exit_daemon(); > break; > } Hi Martin, Your analysis is reasonable. It is necessary to deal with fatal error not only to return, if not doing this multipathd can't exit normally and multipathd commands can't work any more. I think your patch is OK, but I have some ideas inspired by your patch. Calling exit_daemon() is to shut down the multipathd, relay on the outside to pull multipathd again. Is there a function can be use to deal with fatal error? Its function are close the socket(if create successfully before) and create a new socket to make uxlsnr thread work properly again or continue to create uxsocket? This function actually is try to repair those errors. It just an idea, maybe not quite right. Regards, Chongyun -- dm-devel mailing list dm-devel@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/dm-devel