Re: glibc mutex deadlock in signal handler

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 03 Sep 2015 22:55:52 +0200,
Andreas Schwab wrote:
> 
> See
> <http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03>
> for the complete list of functions you may safely call from a signal
> handler.

Thanks, this looks more comprehensive.
FWIW, below is the updated patch with a proper change log.


Takashi

-- 8< --
From: Takashi Iwai <tiwai@xxxxxxx>
Subject: [PATCH] pager: don't use unsafe functions in signal handlers

Since the commit [a3da8821208d: pager: do wait_for_pager on signal
death], we call wait_for_pager() in the pager's signal handler.  The
recent bug report revealed that this causes a deadlock in glibc at
aborting "git log" [*1].  When this happens, git process is left
unterminated, and it can't be killed by SIGTERM but only by SIGKILL.

The problem is that wait_for_pager() function does more than waiting
for pager process's termination, but it does cleanups and printing
errors.  Unfortunately, the functions that may be used in a signal
handler are very limited [*2].  Particularly, malloc(), free() and the
variants can't be used in a signal handler because they take a mutex
internally in glibc.  This was the cause of the deadlock above.  Other
than the direct calls of malloc/free, many functions calling
malloc/free can't be used.  strerror() is such one, either.

Also the usage of fflush() and printf() in a signal handler is bad,
although it seems working so far.  In a safer side, we should avoid
them, too.

This patch tries to reduce the calls of such functions in signal
handlers.  wait_for_pager_signal() is now open-coded not to call
unsafe functions, and finish_command_in_signal() is introduced for the
same reason.  There the free() calls are removed, and only waits for
the children without whining at errors.

[*1] https://bugzilla.opensuse.org/show_bug.cgi?id=942297
[*2] http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03

Signed-off-by: Takashi Iwai <tiwai@xxxxxxx>
---
 pager.c       |  5 ++++-
 run-command.c | 25 +++++++++++++++++--------
 run-command.h |  1 +
 3 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/pager.c b/pager.c
index 27d4c8a17aa1..12d17af73745 100644
--- a/pager.c
+++ b/pager.c
@@ -26,7 +26,10 @@ static void wait_for_pager(void)
 
 static void wait_for_pager_signal(int signo)
 {
-	wait_for_pager();
+	/* signal EOF to pager */
+	close(1);
+	close(2);
+	finish_command_in_signal(&pager_process);
 	sigchain_pop(signo);
 	raise(signo);
 }
diff --git a/run-command.c b/run-command.c
index 3277cf797ed4..e09275bd9e36 100644
--- a/run-command.c
+++ b/run-command.c
@@ -18,26 +18,27 @@ struct child_to_clean {
 static struct child_to_clean *children_to_clean;
 static int installed_child_cleanup_handler;
 
-static void cleanup_children(int sig)
+static void cleanup_children(int sig, int in_signal)
 {
 	while (children_to_clean) {
 		struct child_to_clean *p = children_to_clean;
 		children_to_clean = p->next;
 		kill(p->pid, sig);
-		free(p);
+		if (!in_signal)
+			free(p);
 	}
 }
 
 static void cleanup_children_on_signal(int sig)
 {
-	cleanup_children(sig);
+	cleanup_children(sig, 1);
 	sigchain_pop(sig);
 	raise(sig);
 }
 
 static void cleanup_children_on_exit(void)
 {
-	cleanup_children(SIGTERM);
+	cleanup_children(SIGTERM, 0);
 }
 
 static void mark_child_for_cleanup(pid_t pid)
@@ -220,7 +221,7 @@ static inline void set_cloexec(int fd)
 		fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
 }
 
-static int wait_or_whine(pid_t pid, const char *argv0)
+static int wait_or_whine(pid_t pid, const char *argv0, int in_signal)
 {
 	int status, code = -1;
 	pid_t waiting;
@@ -228,6 +229,8 @@ static int wait_or_whine(pid_t pid, const char *argv0)
 
 	while ((waiting = waitpid(pid, &status, 0)) < 0 && errno == EINTR)
 		;	/* nothing */
+	if (in_signal)
+		return 0;
 
 	if (waiting < 0) {
 		failed_errno = errno;
@@ -437,7 +440,7 @@ fail_pipe:
 		 * At this point we know that fork() succeeded, but execvp()
 		 * failed. Errors have been reported to our stderr.
 		 */
-		wait_or_whine(cmd->pid, cmd->argv[0]);
+		wait_or_whine(cmd->pid, cmd->argv[0], 0);
 		failed_errno = errno;
 		cmd->pid = -1;
 	}
@@ -536,12 +539,18 @@ fail_pipe:
 
 int finish_command(struct child_process *cmd)
 {
-	int ret = wait_or_whine(cmd->pid, cmd->argv[0]);
+	int ret = wait_or_whine(cmd->pid, cmd->argv[0], 0);
 	argv_array_clear(&cmd->args);
 	argv_array_clear(&cmd->env_array);
 	return ret;
 }
 
+int finish_command_in_signal(struct child_process *cmd)
+{
+	return wait_or_whine(cmd->pid, cmd->argv[0], 1);
+}
+
+
 int run_command(struct child_process *cmd)
 {
 	int code;
@@ -772,7 +781,7 @@ error:
 int finish_async(struct async *async)
 {
 #ifdef NO_PTHREADS
-	return wait_or_whine(async->pid, "child process");
+	return wait_or_whine(async->pid, "child process", 0);
 #else
 	void *ret = (void *)(intptr_t)(-1);
 
diff --git a/run-command.h b/run-command.h
index 5b4425a3cbe1..275d35c442ac 100644
--- a/run-command.h
+++ b/run-command.h
@@ -50,6 +50,7 @@ void child_process_init(struct child_process *);
 
 int start_command(struct child_process *);
 int finish_command(struct child_process *);
+int finish_command_in_signal(struct child_process *);
 int run_command(struct child_process *);
 
 /*
-- 
2.5.1

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]