Re: [PATCH v3 1/4] pack-protocol.txt: accept error packets in any context

Jeff King <peff@xxxxxxxx> · Wed, 12 Dec 2018 06:02:07 -0500

On Tue, Dec 11, 2018 at 04:25:15PM -0800, Josh Steadmon wrote:

> From: Masaya Suzuki <masayasuzuki@xxxxxxxxxx>
> 
> In the Git pack protocol definition, an error packet may appear only in
> a certain context. However, servers can face a runtime error (e.g. I/O
> error) at an arbitrary timing. This patch changes the protocol to allow
> an error packet to be sent instead of any packet.
> 
> Following this protocol spec change, the error packet handling code is
> moved to pkt-line.c.

This is a change in the spec with an accompanying change in the code,
which raises the question: what do other implementations do with this
change (both older Git, and implementations like JGit, libgit2, etc)?

I think the answer for older Git is "hang up unceremoniously", which is
probably OK given the semantics of the change. And I'd suspect most
other implementations would do the same. I just wonder if anybody tested
it with other implementations.

> +An error packet is a special pkt-line that contains an error string.
> +
> +----
> +  error-line     =  PKT-LINE("ERR" SP explanation-text)
> +----
> +
> +Throughout the protocol, where `PKT-LINE(...)` is expected, an error packet MAY
> +be sent. Once this packet is sent by a client or a server, the data transfer
> +process defined in this protocol is terminated.

The packfile data is typically packetized, too, and contains arbitrary
data (that could have "ERR" in it). It looks like we don't specifically
say PKT-LINE() in that part of the protocol spec, though, so I think
this is OK.

Likewise, in the implementation:

> diff --git a/pkt-line.c b/pkt-line.c
> index 04d10bbd03..ce9e42d10e 100644
> --- a/pkt-line.c
> +++ b/pkt-line.c
> @@ -346,6 +346,10 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
>  		return PACKET_READ_EOF;
>  	}
>  
> +	if (starts_with(buffer, "ERR ")) {
> +		die(_("remote error: %s"), buffer + 4);
> +	}
> +
>  	if ((options & PACKET_READ_CHOMP_NEWLINE) &&
>  	    len && buffer[len-1] == '\n')
>  		len--;

This ERR handling has been moved to a very low level. What happens if
we're passing arbitrary data via the packet_read() code? Could we
erroneously trigger an error if a packfile happens to have the bytes
"ERR " at a packet boundary?

For packfiles via upload-pack, I _think_ we're OK, because we only
packetize it when a sideband is in use. In which case this would never
match, because we'd have "\1" in the first byte slot.

But are there are other cases we need to worry about? Just
brainstorming, I can think of:

  1. We also pass packetized packfiles between git-remote-https and
     the stateless-rpc mode of fetch-pack/send-pack. And I don't think
     we use sidebands there.

  2. The packet code is used for long-lived clean/smudge filters these
     days, which also pass arbitrary data.

So I think it's probably not a good idea to unconditionally have callers
of packet_read_with_status() handle this. We'd need a flag like
PACKET_READ_RESPECT_ERR, and to trigger it from the appropriate callers.

-Peff