Hi Martin and Ben,

Could you help review this patch? Thanks.

Test environment: 25 hosts, each host has more than 100 LUNs, and each LUN
has two paths. Sometimes when we try to create a new multipath device we
hit "could not create uxsock:98", but multipathd keeps running instead of
shutting down, and it no longer responds to any multipathd commands.

After reproducing and debugging this issue, I found that the following
fixes might work:

(1) In set_config_state(), after pthread_cond_timedwait() returns, another
    thread can change running_state from DAEMON_SHUTDOWN to another state
    such as DAEMON_IDLE, which stalls the shutdown process. I found logs
    that prove this really happened, so we need to add a check for
    DAEMON_SHUTDOWN here too.

(2) Process the exit signal as early as we can. If we only set a flag and
    wait for handle_signals() to handle the exit signal, there may be 5
    more seconds to wait because of the maximum ppoll wait time, and there
    might not be enough time left for all threads to clean up and exit.

With this patch, our testers have not reported this issue again.

Signed-off-by: Chongyun Wu <wu.chongyun@xxxxxxx>
---
 multipathd/main.c | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index 991452930433..620b8264c82f 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -141,7 +141,6 @@ struct udev * udev;
 struct config *multipath_conf;
 
 /* Local variables */
-static volatile sig_atomic_t exit_sig;
 static volatile sig_atomic_t reconfig_sig;
 static volatile sig_atomic_t log_reset_sig;
 
@@ -247,7 +246,7 @@ int set_config_state(enum daemon_status state)
 			rc = pthread_cond_timedwait(&config_cond,
 						    &config_lock, &ts);
 		}
-		if (!rc) {
+		if (!rc && (running_state != DAEMON_SHUTDOWN)) {
 			running_state = state;
 			pthread_cond_broadcast(&config_cond);
 #ifdef USE_SYSTEMD
@@ -2517,11 +2516,6 @@ signal_set(int signo, void (*func) (int))
 void
 handle_signals(bool nonfatal)
 {
-	if (exit_sig) {
-		condlog(2, "exit (signal)");
-		exit_sig = 0;
-		exit_daemon();
-	}
 	if (!nonfatal)
 		return;
 	if (reconfig_sig) {
@@ -2546,7 +2540,8 @@ sighup (int sig)
 static void
 sigend (int sig)
 {
-	exit_sig = 1;
+	condlog(2, "exit (signal)");
+	exit_daemon();
 }
 
 static void
-- 
2.11.0

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel