[PATCH v3 1/2] libmultipath: fix race in stop_io_err_stat_thread

It's wrong, and unnecessary, to call pthread_kill() after
pthread_cancel(). I have observed cases where the io_err_stat thread
hung in libpthread after receiving SIGUSR2, in particular when
multipathd is run under strace. (If multipathd is killed with SIGINT
under strace while the io_err_stat thread is running, it happens
almost every time.) When this happens, the io_err_stat thread tries
to obtain a mutex in the urcu code (presumably in
rcu_unregister_thread()) and the main thread hangs in pthread_join().
In this situation, multipathd can only be terminated with kill -KILL.

With the change from this patch, the thread is shut down cleanly. I haven't
observed the hang under strace with the patch.

Fixes: 95d594fd "multipath-tools: intermittent IO error accounting to improve reliability"

Signed-off-by: Martin Wilck <mwilck@xxxxxxxx>
---
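For illustration only, here is a minimal standalone sketch of the
cancel-then-join pattern that remains after this patch. It is not code
from multipath-tools: demo_ctx, worker(), worker_cleanup() and
cancel_and_join_demo() are hypothetical names, and sleep() merely stands
in for whatever cancellation point the real thread blocks in. It assumes
the default deferred cancellation type.

```c
#include <pthread.h>
#include <semaphore.h>
#include <unistd.h>

/* Hypothetical context shared between the demo's main thread and worker. */
struct demo_ctx {
	sem_t ready;		/* posted once the cleanup handler is installed */
	int cleaned_up;		/* set to 1 by the cleanup handler */
};

/* Runs when the worker is cancelled (or if it popped the handler itself). */
static void worker_cleanup(void *arg)
{
	struct demo_ctx *ctx = arg;

	ctx->cleaned_up = 1;
}

static void *worker(void *arg)
{
	struct demo_ctx *ctx = arg;

	pthread_cleanup_push(worker_cleanup, ctx);
	sem_post(&ctx->ready);
	for (;;)
		sleep(1);	/* sleep() is a cancellation point */
	pthread_cleanup_pop(1);	/* not reached, pairs with the push above */
	return NULL;
}

/* Returns 0 if the worker was cancelled and its cleanup handler ran. */
int cancel_and_join_demo(void)
{
	struct demo_ctx ctx = { .cleaned_up = 0 };
	pthread_t thr;
	void *res = NULL;

	if (sem_init(&ctx.ready, 0, 0) != 0)
		return -1;
	if (pthread_create(&thr, NULL, worker, &ctx) != 0) {
		sem_destroy(&ctx.ready);
		return -1;
	}
	/* Wait until the worker has installed its cleanup handler. */
	while (sem_wait(&ctx.ready) != 0)
		;	/* retry if interrupted by a signal */

	/* Request cancellation and wait; no signal is sent to the thread. */
	pthread_cancel(thr);
	pthread_join(thr, &res);
	sem_destroy(&ctx.ready);

	return (res == PTHREAD_CANCELED && ctx.cleaned_up) ? 0 : -1;
}
```

Called from a small main() and built with -pthread, the function should
return 0: the worker acts on the cancellation request at its next
cancellation point, runs its cleanup handler, and pthread_join() reports
PTHREAD_CANCELED.
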
 libmultipath/io_err_stat.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/libmultipath/io_err_stat.c b/libmultipath/io_err_stat.c
index 00bac9e0e755..536ba87968fd 100644
--- a/libmultipath/io_err_stat.c
+++ b/libmultipath/io_err_stat.c
@@ -749,7 +749,6 @@ destroy_ctx:
 void stop_io_err_stat_thread(void)
 {
 	pthread_cancel(io_err_stat_thr);
-	pthread_kill(io_err_stat_thr, SIGUSR2);
 	pthread_join(io_err_stat_thr, NULL);
 	free_io_err_pathvec(paths);
 	io_destroy(ioctx);
-- 
2.16.1

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel


