When opening a loose object file, we often do this sequence:

 - prepare a short buffer for the object header (on stack)

 - call unpack_sha1_header() and have the early part of the object
   data inflated, enough to fill the buffer

 - parse that data in the short buffer, assuming that the first
   part of the object is <type> SP <length> NUL

Nobody in this sequence, however, actually verifies that the SP that
must come after the typename, or the NUL that must come after the
length, exists in the inflated data.  Because the parsing function
parse_sha1_header_extended() is not even given the number of bytes
inflated into the header buffer, it can easily read past it, looking
for the SP byte that may not even exist.

A variant recently introduced to support the "--allow-unknown-type"
option of "git cat-file -t" changes the second step to use
unpack_sha1_header_to_strbuf(), but the story is essentially the
same.  It did check to see if it saw enough to include NUL, but
nobody checked for SP before calling the parsing function.

To correct this, do these three things:

 - rename unpack_sha1_header() to unpack_sha1_short_header() and
   have unpack_sha1_header_to_strbuf() keep calling that as its
   helper function.  This will detect and report zlib errors, but
   is not aware of the format of a loose object (as before).

 - introduce unpack_sha1_header() that calls the same helper
   function, and when zlib reports it inflated OK into the buffer,
   check if the buffer has both SP and NUL in this order.  This
   would ensure that the parsing function will terminate within the
   buffer that holds the inflated header.

 - update unpack_sha1_header_to_strbuf() to check if the resulting
   buffer has both SP and NUL in this order for the same effect.

Reported-by: Gustavo Grieco <gustavo.grieco@xxxxxxx>
Signed-off-by: Junio C Hamano <gitster@xxxxxxxxx>
---

 * Unlike the "something like this" version, this does the "we got
   some data, does it look like an object header, safely parseable
   by our parser?" check in the unpack code, without touching the
   parser, as I think that division of labor between the unpacker
   and the parser makes more sense.

   The strbuf codepath came in 46f03448 ("sha1_file: support
   reading from a loose object of unknown type", 2015-05-03) by
   Karthik, whose log says it was written by me, and helped by
   Peff, so I'm asking these two to lend their eyes.

 sha1_file.c | 40 +++++++++++++++++++++++++++++++++++++---
 1 file changed, 37 insertions(+), 3 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index b9c1fa3..445e763 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1646,7 +1646,9 @@ unsigned long unpack_object_header_buffer(const unsigned char *buf,
 	return used;
 }
 
-int unpack_sha1_header(git_zstream *stream, unsigned char *map, unsigned long mapsize, void *buffer, unsigned long bufsiz)
+static int unpack_sha1_short_header(git_zstream *stream,
+				    unsigned char *map, unsigned long mapsize,
+				    void *buffer, unsigned long bufsiz)
 {
 	/* Get the data stream */
 	memset(stream, 0, sizeof(*stream));
@@ -1659,13 +1661,37 @@ int unpack_sha1_header(git_zstream *stream, unsigned char *map, unsigned long ma
 	return git_inflate(stream, 0);
 }
 
+int unpack_sha1_header(git_zstream *stream,
+		       unsigned char *map, unsigned long mapsize,
+		       void *buffer, unsigned long bufsiz)
+{
+	const char *eoh;
+	int status = unpack_sha1_short_header(stream, map, mapsize,
+					      buffer, bufsiz);
+
+	if (status < Z_OK)
+		return status;
+
+	/* Make sure we have the terminating NUL */
+	eoh = memchr(buffer, '\0', stream->next_out - (unsigned char *)buffer);
+	if (!eoh)
+		return -1;
+	/* Make sure we have ' ' at the end of type */
+	if (!memchr(buffer, ' ', eoh - (const char *)buffer))
+		return -1;
+	return 0;
+}
+
 static int unpack_sha1_header_to_strbuf(git_zstream *stream, unsigned char *map,
 					unsigned long mapsize, void *buffer,
 					unsigned long bufsiz, struct strbuf *header)
 {
+	const char *eoh;
 	int status;
 
-	status = unpack_sha1_header(stream, map, mapsize, buffer, bufsiz);
+	status = unpack_sha1_short_header(stream, map, mapsize, buffer, bufsiz);
+	if (status < Z_OK)
+		return -1;
 
 	/*
 	 * Check if entire header is unpacked in the first iteration.
@@ -1686,11 +1712,19 @@ static int unpack_sha1_header_to_strbuf(git_zstream *stream, unsigned char *map,
 		status = git_inflate(stream, 0);
 		strbuf_add(header, buffer, stream->next_out - (unsigned char *)buffer);
 		if (memchr(buffer, '\0', stream->next_out - (unsigned char *)buffer))
-			return 0;
+			goto enough;
 		stream->next_out = buffer;
 		stream->avail_out = bufsiz;
 	} while (status != Z_STREAM_END);
 	return -1;
+
+enough:
+	eoh = memchr(header->buf, '\0', header->len);
+	if (!eoh)
+		die("BUG: the NUL we earlier saw is gone???");
+	if (!memchr(header->buf, ' ', eoh - header->buf))
+		return -1;
+	return 0;
 }
 
 static void *unpack_sha1_rest(git_zstream *stream, void *buffer, unsigned long size, const unsigned char *sha1)
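
For readers who want to try the "SP, then NUL, both inside the
inflated bytes" rule in isolation, here is a minimal standalone
sketch.  It is not part of the patch: header_looks_parseable() is a
hypothetical name, and it takes an explicit byte count where the
patch derives one from stream->next_out or header->len.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: return 1 if the first
 * "len" bytes of "buf" contain a NUL with at least one SP somewhere
 * before it, and 0 otherwise.  It does not validate the type name or
 * the length digits; it only guarantees that a parser scanning for
 * SP and then NUL will stop inside the buffer.
 */
static int header_looks_parseable(const char *buf, size_t len)
{
	/* The NUL must be within the bytes that were actually inflated. */
	const char *eoh = memchr(buf, '\0', len);

	if (!eoh)
		return 0;
	/* The SP that ends the type name must come before that NUL. */
	return memchr(buf, ' ', (size_t)(eoh - buf)) != NULL;
}
```

Searching for NUL first and then limiting the SP search to the bytes
before it is what rejects a header whose only SP appears after the
NUL, which a plain "does the buffer contain both bytes" test would
let through.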