Re: [RFC] upload-pack deadlock

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jeff King <peff@xxxxxxxx> writes:

> In a pthread-enabled version of upload-pack, there's a race condition
> that can cause a deadlock on the fflush(NULL) we call from run-command.
>
> What happens is this:
>
>   1. Upload-pack is informed we are doing a shallow clone.
>
>   2. We call start_async() to spawn a thread that will generate rev-list
>      results to feed to pack-objects. It gets a file descriptor to a
>      pipe which will eventually hook to pack-objects.
>
>   3. The rev-list thread uses fdopen to create a new output stream
>      around the fd we gave it, called pack_pipe.
>
>   4. The thread writes results to pack_pipe. Outside of our control,
>      libc is doing locking on the stream. We keep writing until the OS
>      pipe buffer is full, and then we block in write(), still holding
>      the lock.
>
>   5. The main thread now uses start_command to spawn pack-objects.
>      Before forking, it calls fflush(NULL) to flush every stdio output
>      buffer. It blocks trying to get the lock on pack_pipe.
>
> And we have a deadlock. The thread will block until somebody starts
> reading from the pipe. But nobody will read from the pipe until we
> finish flushing to the pipe.

Thanks for a concise summary.

> There are a few possible solutions:
>
>   1. Flip the order of initialization, so that we don't start writing
>      anything until the pack-objects reader is already in place. That
>      works in this case, and the patch is at the end of this mail. The
>      biggest problem is that it doesn't fix the general case.

In what sense are you not fixing "the general case", though?

If a program runs two threads, both of which access the FILE streams, it
should be the responsibility of the program to get these threads
coordinated to avoid such races and deadlocks, no?

While I think you mean start_async() API should be helping the API users
(calling programs) to fulfil that responsibility better, it really is up
to the thread to do random things to wreak havoc, like the shallow
codepath that tries to fill the stream while the main codepath expected
nothing happening, and I do not think of a good abstraction to help us
being more careful.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]