Re: [RFC PATCH] vreportf: ensure sensible ordering of normal and error output

Jeff King <peff@xxxxxxxx> · Tue, 30 Nov 2021 02:14:25 -0500

On Mon, Nov 29, 2021 at 09:13:10PM -0800, Junio C Hamano wrote:

> Eric Sunshine <sunshine@xxxxxxxxxxxxxx> writes:
> 
> > This is RFC because I naturally worry about potential fallout from
> > making a change to such a core function. I can't think of any case that
> > it wouldn't be advantageous to flush stdout before stderr, so this
> > change _seems_ safe, however, it may be that I'm just not imaginative
> > enough, hence my hesitancy.
> 
> If stdout and stderr are both going to the same place (e.g. the
> user's terminal), this would probably be an improvement, but if the
> standard output is going to a pipe talking to another process, which
> may care when the output is flushed, this may hurt.
> 
> But as long as the calling code is using stdio, it cannot precisely
> control when the buffered contents are flushed anyway, so as long as
> the caller has working standard output, this may be OK.

Yeah, I think this logic applies to the "happy" case. Any caller which
is depending on the timing of the flush is already racily buggy.

What I wonder about is the error case. What can happen if flushing
fails? There are two interesting cases I can think of:

  - flushing causes an error (which is quite likely, as we may call
    vreportf() because of an error on stdout). We should be OK, as we
    neither check the return value here nor check ferror(stdout)
    later. We may overwrite errno, but at this point in vreportf(), we
    are committed to whatever error we're going to show (and obviously
    the stderr flush below could cause the same issues).

  - flushing causes us to block. This implies our stdout is connected to
    a pipe or socket, and the other side is not expecting to read. A
    plausible case here is a client sending us a big input which we find
    to be bogus (maybe index-pack checking an incoming pack). We call
    die() to complain about the input, but the client is still writing.
    In the current code, we'd write out our error and then exit; the
    client would get SIGPIPE or a write() error and abort. But with a
    flush here, we could block writing back to the client, and now we're
    in a deadlock; they are trying to write to us but we are no longer
    reading, and we are blocked trying to get out a little bit of
    irrelevant stdout data.

    I _think_ we're probably OK here. The scenario above means that the
    caller is already doing asynchronous I/O via stdio and is subject to
    deadlock, because the segment of buffer we try to flush here _could_
    have been flushed already under the hood, which would have caused
    the same blocking. A careful caller might be using select() or
    similar to decide when it is OK to write, but I find it highly
    unlikely they'd be using stdio in that case.

Of the two, the deadlock case worries me more, just because it would be
quite subtle and racy. As I said, I think we may be OK, but my reasoning
there is pretty hand-wavy.

-Peff