Re: [PATCH] multipathd: fix daemon not really shutdown

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Chongyun, 

On Wed, 2018-12-26 at 06:34 +0000, Chongyun Wu wrote:
> Hi Martin and Ben,
> 
> Cloud you help to review this patch, thanks.
> 
> Test environment: 25 hosts, each host have more than 100 luns,
> each lun have two paths.
> Some times when we try to ceate new multipath will encounter "could
> not create uxsock:98" but the multipathd still running not shutdown
> and can't response any multipathd commands also.
> 
> After reproduce this issue and debug, found below fixes might work:
> (1) set_config_state() after pthread_cond_timedwait() other threads
> might changed the running_state from DAEMON_SHUTDOWN to other status
> like DAEMON_IDLE, which will make the shutdown process stopped.
> I found logs to prove this really happened, so we need add judgement
> here too.
> 
> (2) process exit signal as possible as we can.
> If we just set a flag and wait handle_signals to handle exit signal,
> there might have more 5s to wait because of ppoll maximal wait time.
> And there might haven't enouth time left for all threads to clean up
> and exit.
> 
> With this patch our tester not report this issue again.

please verify that you were using the latest upstream code,
in particular that f1c73962 "multipathd: make DAEMON_SHUTDOWN a
terminal state" and the predecessors 9de272eb, 234cab29 are
included in your code base. I made these patches
for the problem you are describing, and it worked for me.

Martin



> 
> Signed-off-by: Chongyun Wu <wu.chongyun@xxxxxxx>
> ---
>   multipathd/main.c | 11 +++--------
>   1 file changed, 3 insertions(+), 8 deletions(-)
> 
> diff --git a/multipathd/main.c b/multipathd/main.c
> index 991452930433..620b8264c82f 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -141,7 +141,6 @@ struct udev * udev;
>   struct config *multipath_conf;
> 
>   /* Local variables */
> -static volatile sig_atomic_t exit_sig;
>   static volatile sig_atomic_t reconfig_sig;
>   static volatile sig_atomic_t log_reset_sig;
> 
> @@ -247,7 +246,7 @@ int set_config_state(enum daemon_status state)
>   			rc = pthread_cond_timedwait(&config_cond,
>   						    &config_lock, &ts);
>   		}
> -		if (!rc) {
> +		if (!rc && (running_state != DAEMON_SHUTDOWN)) {
>   			running_state = state;
>   			pthread_cond_broadcast(&config_cond);
>   #ifdef USE_SYSTEMD
> @@ -2517,11 +2516,6 @@ signal_set(int signo, void (*func) (int))
>   void
>   handle_signals(bool nonfatal)
>   {
> -	if (exit_sig) {
> -		condlog(2, "exit (signal)");
> -		exit_sig = 0;
> -		exit_daemon();
> -	}
>   	if (!nonfatal)
>   		return;
>   	if (reconfig_sig) {
> @@ -2546,7 +2540,8 @@ sighup (int sig)
>   static void
>   sigend (int sig)
>   {
> -	exit_sig = 1;
> +	condlog(2, "exit (signal)");
> +	exit_daemon();
>   }
> 
>   static void


--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel



[Index of Archives]     [DM Crypt]     [Fedora Desktop]     [ATA RAID]     [Fedora Marketing]     [Fedora Packaging]     [Fedora SELinux]     [Yosemite Discussion]     [KDE Users]     [Fedora Docs]

  Powered by Linux