On Wed, Dec 12, 2018 at 3:02 AM Jeff King <peff@xxxxxxxx> wrote:
>
> On Tue, Dec 11, 2018 at 04:25:15PM -0800, Josh Steadmon wrote:
>
> > From: Masaya Suzuki <masayasuzuki@xxxxxxxxxx>
> >
> > In the Git pack protocol definition, an error packet may appear only in
> > a certain context. However, servers can face a runtime error (e.g. I/O
> > error) at an arbitrary timing. This patch changes the protocol to allow
> > an error packet to be sent instead of any packet.
> >
> > Following this protocol spec change, the error packet handling code is
> > moved to pkt-line.c.
>
> This is a change in the spec with an accompanying change in the code,
> which raises the question: what do other implementations do with this
> change (both older Git, and implementations like JGit, libgit2, etc)?

JGit is similar to Git. It parses "ERR " in limited places. When it sees
an ERR packet in an unexpected place, it'll fail somewhere in the
parsing code.

https://github.com/eclipse/jgit/blob/30c6c7542190c149e2aee792f992a312a5fc5793/org.eclipse.jgit/src/org/eclipse/jgit/transport/PacketLineIn.java#L145-L147
https://github.com/eclipse/jgit/blob/f40b39345cd9b54473ee871bff401fe3d394ffe3/org.eclipse.jgit/src/org/eclipse/jgit/transport/BasePackConnection.java#L208

I'm not familiar with libgit2 code, but it seems it handles this at a
lower level. An error type packet is parsed out at a low level, and the
error handling is done by the callers of the packet parser.

https://github.com/libgit2/libgit2/blob/bea65980c7a42e34edfafbdc40b199ba7b2a564e/src/transports/smart_pkt.c#L482-L483

I cannot find any ERR packet handling in go-git. It seems to me that if
an ERR packet appears, it is treated as a parsing error.

https://github.com/src-d/go-git/blob/master/plumbing/protocol/packp/common.go#L60-L62

>
> I think the answer for older Git is "hang up unceremoniously", which is
> probably OK given the semantics of the change. And I'd suspect most
> other implementations would do the same.
> I just wonder if anybody tested
> it with other implementations.

I'm thinking aloud here. There are two aspects of protocol
compatibility: (1) new clients speaking to old servers, and (2) old
clients speaking to a new server that speaks the updated protocol.

For (1), I assume that in the Git pack protocol, a packet starting with
"ERR " does not appear naturally, except in the very special case where
the server speaks the updated protocol but doesn't support sideband. As
you mentioned, at first it looks like this can mistakenly parse the pack
file of git-receive-pack as an ERR packet, assuming that
git-receive-pack's pack file is packetized. Actually, git-receive-pack's
pack file is not packetized in the Git pack protocol
(https://github.com/git/git/blob/master/builtin/receive-pack.c#L1695).
I recently wrote a Git protocol parser
(https://github.com/google/gitprotocolio), and I confirmed that this is
the case at least for the HTTP transport. git-upload-pack's pack file is
indeed packetized, but packetized with sideband. Except for the case
where sideband is not used, the packfiles wouldn't accidentally be
treated as an ERR packet.

For (2), if old clients see an unexpected ERR packet, they cannot parse
it. They would handle this unparsable data as if the server were not
speaking the Git protocol correctly. Even if the old clients just ignore
the packet, the server won't send further data after an ERR packet, so
the client won't be able to proceed.

Overall, the clients face an error either way, and the only difference
is whether they can show the error nicely or not. New clients will show
the errors nicely to users. Old clients will not.

>
> > +An error packet is a special pkt-line that contains an error string.
> > +
> > +----
> > +  error-line     =  PKT-LINE("ERR" SP explanation-text)
> > +----
> > +
> > +Throughout the protocol, where `PKT-LINE(...)` is expected, an error packet MAY
> > +be sent.
> > +Once this packet is sent by a client or a server, the data transfer
> > +process defined in this protocol is terminated.
>
> The packfile data is typically packetized, too, and contains arbitrary
> data (that could have "ERR" in it). It looks like we don't specifically
> say PKT-LINE() in that part of the protocol spec, though, so I think
> this is OK.

As I described above, as far as I can see, the packfile in
git-receive-pack is not packetized. The packfile in git-upload-pack is
packetized, but typically with sideband. At least at the Git pack
protocol level, this should be OK.

>
> Likewise, in the implementation:
>
> > diff --git a/pkt-line.c b/pkt-line.c
> > index 04d10bbd03..ce9e42d10e 100644
> > --- a/pkt-line.c
> > +++ b/pkt-line.c
> > @@ -346,6 +346,10 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
> >  		return PACKET_READ_EOF;
> >  	}
> >
> > +	if (starts_with(buffer, "ERR ")) {
> > +		die(_("remote error: %s"), buffer + 4);
> > +	}
> > +
> >  	if ((options & PACKET_READ_CHOMP_NEWLINE) &&
> >  	    len && buffer[len-1] == '\n')
> >  		len--;
>
> This ERR handling has been moved to a very low level. What happens if
> we're passing arbitrary data via the packet_read() code? Could we
> erroneously trigger an error if a packfile happens to have the bytes
> "ERR " at a packet boundary?
>
> For packfiles via upload-pack, I _think_ we're OK, because we only
> packetize it when a sideband is in use. In which case this would never
> match, because we'd have "\1" in the first byte slot.
>
> But are there are other cases we need to worry about? Just
> brainstorming, I can think of:
>
>   1. We also pass packetized packfiles between git-remote-https and
>      the stateless-rpc mode of fetch-pack/send-pack. And I don't think
>      we use sidebands there.
>
>   2. The packet code is used for long-lived clean/smudge filters these
>      days, which also pass arbitrary data.
>
> So I think it's probably not a good idea to unconditionally have callers
> of packet_read_with_status() handle this. We'd need a flag like
> PACKET_READ_RESPECT_ERR, and to trigger it from the appropriate callers.

This is outside of the Git pack protocol, so having a separate parsing
mode makes sense to me.

>
> -Peff
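
To make the "separate parsing mode" idea concrete, here is a rough,
standalone sketch of an opt-in ERR check. It is not the patch and not
Git's pkt-line.c: pkt_parse() and enum pkt_status are hypothetical names
invented for this example, it parses from a memory buffer instead of a
file descriptor, and the exact semantics given to
PACKET_READ_RESPECT_ERR are my assumption. It also ignores the special
short packets used by protocol v2.

```c
#include <stddef.h>
#include <string.h>

/* Opt-in bit named after the flag proposed above; semantics are assumed. */
#define PACKET_READ_RESPECT_ERR (1u << 0)

/* Hypothetical result codes for this sketch (not Git's packet_read_status). */
enum pkt_status { PKT_EOF, PKT_FLUSH, PKT_NORMAL, PKT_REMOTE_ERR, PKT_BAD };

/* Return the value of one hex digit, or -1 if c is not a hex digit. */
static int hexval(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Parse one pkt-line from buf[0..avail). The 4-digit hex length prefix
 * counts itself, so "0000" is a flush packet and the payload is len - 4
 * bytes. On PKT_NORMAL, *payload/*payload_len describe the payload. On
 * PKT_REMOTE_ERR they describe the text after "ERR ". *consumed is the
 * number of bytes used from buf.
 */
static enum pkt_status pkt_parse(const char *buf, size_t avail, unsigned options,
				 const char **payload, size_t *payload_len,
				 size_t *consumed)
{
	size_t len = 0;
	int i;

	*payload = NULL;
	*payload_len = 0;
	*consumed = 0;

	if (avail == 0)
		return PKT_EOF;
	if (avail < 4)
		return PKT_BAD;

	for (i = 0; i < 4; i++) {
		int v = hexval(buf[i]);
		if (v < 0)
			return PKT_BAD;
		len = (len << 4) | (size_t)v;
	}

	if (len == 0) {
		*consumed = 4;
		return PKT_FLUSH;
	}
	if (len < 4 || len > avail)
		return PKT_BAD; /* lengths 1..3 (v2 special packets) not modeled */

	*payload = buf + 4;
	*payload_len = len - 4;
	*consumed = len;

	/* Only callers that opted in get "ERR " interpreted. */
	if ((options & PACKET_READ_RESPECT_ERR) &&
	    *payload_len >= 4 && memcmp(*payload, "ERR ", 4) == 0) {
		*payload += 4;
		*payload_len -= 4;
		return PKT_REMOTE_ERR;
	}
	return PKT_NORMAL;
}
```

With this shape, a pack protocol caller would pass
PACKET_READ_RESPECT_ERR and report the explanation text on
PKT_REMOTE_ERR, while callers that carry arbitrary data (the
remote-https and clean/smudge cases listed above) would pass 0 and get
the payload back untouched. A sideband packet never matches even with
the flag set, because its first payload byte is the band number.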