[PATCH] multipathd: fix daemon not really shutting down

Hi Martin and Ben,

Could you please review this patch? Thanks.

Test environment: 25 hosts, each host with more than 100 LUNs, and
each LUN with two paths.
Sometimes, when we try to create a new multipath device, we hit
"could not create uxsock:98", but multipathd is still running: it has
not shut down, and it no longer responds to any multipathd commands.

After reproducing and debugging this issue, we found that the
following fixes should help:
(1) In set_config_state(), after pthread_cond_timedwait() returns, a
thread can change running_state from DAEMON_SHUTDOWN to another state
such as DAEMON_IDLE, which stalls the shutdown process.
Our logs confirm that this really happened, so we need to add a check
for DAEMON_SHUTDOWN here too.

(2) Handle the exit signal as early as possible.
If the signal handler only sets a flag and leaves the work to
handle_signals(), there may be an additional wait of up to 5s because
of the maximum ppoll wait time. There might then not be enough time
left for all threads to clean up and exit.

With this patch applied, our testers have not reported this issue
again.

Signed-off-by: Chongyun Wu <wu.chongyun@xxxxxxx>
---
  multipathd/main.c | 11 +++--------
  1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index 991452930433..620b8264c82f 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -141,7 +141,6 @@ struct udev * udev;
  struct config *multipath_conf;

  /* Local variables */
-static volatile sig_atomic_t exit_sig;
  static volatile sig_atomic_t reconfig_sig;
  static volatile sig_atomic_t log_reset_sig;

@@ -247,7 +246,7 @@ int set_config_state(enum daemon_status state)
  			rc = pthread_cond_timedwait(&config_cond,
  						    &config_lock, &ts);
  		}
-		if (!rc) {
+		if (!rc && (running_state != DAEMON_SHUTDOWN)) {
  			running_state = state;
  			pthread_cond_broadcast(&config_cond);
  #ifdef USE_SYSTEMD
@@ -2517,11 +2516,6 @@ signal_set(int signo, void (*func) (int))
  void
  handle_signals(bool nonfatal)
  {
-	if (exit_sig) {
-		condlog(2, "exit (signal)");
-		exit_sig = 0;
-		exit_daemon();
-	}
  	if (!nonfatal)
  		return;
  	if (reconfig_sig) {
@@ -2546,7 +2540,8 @@ sighup (int sig)
  static void
  sigend (int sig)
  {
-	exit_sig = 1;
+	condlog(2, "exit (signal)");
+	exit_daemon();
  }

  static void
-- 
2.11.0



--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel


