On Fri, Jan 06, 2017 at 03:39:59PM +0100, Johannes Sixt wrote:

> > diff --git a/run-command.c b/run-command.c
> > index ca905a9e80..db47c429b7 100644
> > --- a/run-command.c
> > +++ b/run-command.c
> > @@ -29,6 +29,8 @@ static int installed_child_cleanup_handler;
> >  
> >  static void cleanup_children(int sig, int in_signal)
> >  {
> > +	struct child_to_clean *children_to_wait_for = NULL;
> > +
> >  	while (children_to_clean) {
> >  		struct child_to_clean *p = children_to_clean;
> >  		children_to_clean = p->next;
> > @@ -45,6 +47,17 @@ static void cleanup_children(int sig, int in_signal)
> >  		}
> >  
> >  		kill(p->pid, sig);
> > +		p->next = children_to_wait_for;
> > +		children_to_wait_for = p;
> > +	}
> > +
> > +	while (children_to_wait_for) {
> > +		struct child_to_clean *p = children_to_wait_for;
> > +		children_to_wait_for = p->next;
> > +
> > +		while (waitpid(p->pid, NULL, 0) < 0 && errno == EINTR)
> > +			; /* spin waiting for process exit or error */
> > +
> >  		if (!in_signal)
> >  			free(p);
> >  	}
> >
>
> This looks like the minimal change necessary. I wonder, though, whether the
> new local variable is really required. Wouldn't it be sufficient to walk the
> children_to_clean chain twice?

Yeah, I considered that. The fact that we disassemble the list in the
first loop has two side effects:

  1. It lets us free the list as we go (for the !in_signal case).

  2. If we were to get another signal, it makes us sort-of reentrant.
     We will only kill and wait for each pid once.

Obviously (1) moves down to the lower loop, but I was trying to preserve
(2). I'm not sure if it is worth bothering, though. The way we pull
items off of the list is certainly not atomic (it does shorten the race
to a few instructions, though, versus potentially waiting on waitpid()
to return).

My bigger concern with the whole thing is whether we could hit some sort
of deadlock if the child doesn't die when we send it a signal. E.g.,
imagine we have a pipe open to the child and somebody sends SIGTERM to
us. We propagate SIGTERM to the child, and then waitpid() for it. The
child decides to ignore our SIGTERM for some reason and keep reading
until EOF on the pipe. It won't ever get it, and the two processes will
hang forever.

You can argue perhaps that the child is broken in that case. And I doubt
this could trigger when running a git sub-command. But we may add more
children in the future. Right now we use it for the new multi-file
clean/smudge filters. They use the hook feature to close the
descriptors, but note that that won't run in the in_signal case.

So I dunno. Maybe this waiting should be restricted only to certain
cases like executing git sub-commands.

-Peff
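
To make the pipe scenario concrete, here is a rough standalone sketch of
it. This is not git code: the function name kill_close_and_wait() is made
up for illustration, and it assumes a plain POSIX system. The child
ignores SIGTERM and reads until EOF; the parent signals it, closes its
write end of the pipe, and only then waits. Dropping that close() should
reproduce the hang described above.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo (not from run-command.c): returns the child's
 * exit code, or -1 on error.
 */
int kill_close_and_wait(void)
{
	int fds[2];
	int status;
	pid_t pid;
	char buf[64];

	if (pipe(fds) < 0)
		return -1;

	/*
	 * An ignored disposition is inherited across fork(), so the
	 * child ignores SIGTERM from its very first instruction. As a
	 * side effect the calling process ignores it too, which is
	 * acceptable for a demo only.
	 */
	signal(SIGTERM, SIG_IGN);

	pid = fork();
	if (pid < 0)
		return -1;
	if (!pid) {
		close(fds[1]); /* drop the child's copy of the write end */
		while (read(fds[0], buf, sizeof(buf)) > 0)
			; /* keep reading until EOF */
		_exit(0);
	}

	close(fds[0]);
	kill(pid, SIGTERM); /* the child ignores this */

	/*
	 * Closing the write end is what lets the child see EOF.
	 * Without it, the waitpid() below would never return.
	 */
	close(fds[1]);

	while (waitpid(pid, &status, 0) < 0) {
		if (errno != EINTR)
			return -1;
	}
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling kill_close_and_wait() from a trivial main() should return 0 once
the child sees EOF and exits. In the in_signal case nothing performs the
equivalent of that close(), which is the case I'm worried about.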