On Wed, Apr 06, 2011 at 01:54:13PM -0400, Jeff King wrote: > So I am wondering if we could simply drop the fflush(NULL) entirely in > the start_command case. And in the start_async case, move it inside the > NO_PTHREADS case. > > I guess the fflush does do one other thing; it makes sure that output on > a single descriptor is ordered sensibly. And we would be losing that. After reading 13af8cb (start_command: flush buffers in the WIN32 code path as well, 2011-02-04), I think dropping the fflush is a bad idea. Let's do the simple and safe fix, and if this type of problem actually comes up more than once, then I'll think about over-engineering an abstraction to fix it. :) Here it is with a nice commit message. -- >8 -- Subject: [PATCH] upload-pack: start pack-objects before async rev-list In a pthread-enabled version of upload-pack, there's a race condition that can cause a deadlock on the fflush(NULL) we call from run-command. What happens is this: 1. Upload-pack is informed we are doing a shallow clone. 2. We call start_async() to spawn a thread that will generate rev-list results to feed to pack-objects. It gets a file descriptor to a pipe which will eventually hook to pack-objects. 3. The rev-list thread uses fdopen to create a new output stream around the fd we gave it, called pack_pipe. 4. The thread writes results to pack_pipe. Outside of our control, libc is doing locking on the stream. We keep writing until the OS pipe buffer is full, and then we block in write(), still holding the lock. 5. The main thread now uses start_command to spawn pack-objects. Before forking, it calls fflush(NULL) to flush every stdio output buffer. It blocks trying to get the lock on pack_pipe. And we have a deadlock. The thread will block until somebody starts reading from the pipe. But nobody will read from the pipe until we finish flushing to the pipe. To fix this, we swap the start order: we start the pack-objects reader first, and then the rev-list writer after. Thus the problematic fflush(NULL) happens before we even open the new file descriptor (and even if it didn't, flushing should no longer block, as the reader at the end of the pipe is now active). Signed-off-by: Jeff King <peff@xxxxxxxx> --- And the result is one line shorter, so it _must_ be good. upload-pack.c | 23 +++++++++++------------ 1 files changed, 11 insertions(+), 12 deletions(-) diff --git a/upload-pack.c b/upload-pack.c index bba053f..ce5cbbe 100644 --- a/upload-pack.c +++ b/upload-pack.c @@ -157,15 +157,8 @@ static void create_pack_file(void) const char *argv[10]; int arg = 0; - if (shallow_nr) { - memset(&rev_list, 0, sizeof(rev_list)); - rev_list.proc = do_rev_list; - rev_list.out = -1; - if (start_async(&rev_list)) - die("git upload-pack: unable to fork git-rev-list"); - argv[arg++] = "pack-objects"; - } else { - argv[arg++] = "pack-objects"; + argv[arg++] = "pack-objects"; + if (!shallow_nr) { argv[arg++] = "--revs"; if (create_full_pack) argv[arg++] = "--all"; @@ -183,7 +176,7 @@ static void create_pack_file(void) argv[arg++] = NULL; memset(&pack_objects, 0, sizeof(pack_objects)); - pack_objects.in = shallow_nr ? rev_list.out : -1; + pack_objects.in = -1; pack_objects.out = -1; pack_objects.err = -1; pack_objects.git_cmd = 1; @@ -192,8 +185,14 @@ static void create_pack_file(void) if (start_command(&pack_objects)) die("git upload-pack: unable to fork git-pack-objects"); - /* pass on revisions we (don't) want */ - if (!shallow_nr) { + if (shallow_nr) { + memset(&rev_list, 0, sizeof(rev_list)); + rev_list.proc = do_rev_list; + rev_list.out = pack_objects.in; + if (start_async(&rev_list)) + die("git upload-pack: unable to fork git-rev-list"); + } + else { FILE *pipe_fd = xfdopen(pack_objects.in, "w"); if (!create_full_pack) { int i; -- 1.7.4.3.13.g0b769.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html