Shawn O. Pearce wrote:
> Liu Yubao <yubao.liu@xxxxxxxxx> wrote:
>> diff --git a/sha1_file.c b/sha1_file.c
>> index 6c0e251..efe6967 100644
>> --- a/sha1_file.c
>> +++ b/sha1_file.c
>> @@ -1254,10 +1255,10 @@ static int parse_sha1_header(const char *hdr, unsigned long *sizep)
>>      /*
>>       * The type can be at most ten bytes (including the
>>       * terminating '\0' that we add), and is followed by
>> -     * a space.
>> +     * a space, at least one byte for size, and a '\0'.
>>       */
>>      i = 0;
>> -    for (;;) {
>> +    while (hdr < hdr_end - 2) {
>>          char c = *hdr++;
>>          if (c == ' ')
>>              break;
>> @@ -1265,6 +1266,8 @@ static int parse_sha1_header(const char *hdr, unsigned long *sizep)
>>          if (i >= sizeof(type))
>>              return -1;
>
> That first hunk I am citing is unnecessary, because of the lines
> right above. All of the callers of this function pass in a buffer
> that is at least 32 bytes in size; this loop aborts if it does not
> find a ' ' within the first 10 bytes of the buffer. We'll never
> access memory outside of the buffer during this loop.
>
> So IMHO your first three hunks here aren't necessary.
>

It seems you missed the cover letter sent as patch 0/5; all the patches
are explained there. Sorry, I sent them as separate topics by mistake.

This bounds check is mainly for uncompressed loose objects, that is,
loose objects that are simply stored without compression:

    uncompressed loose object = inflate(loose object)
    loose object = deflate(typename + <space> + size + '\0' + data)

This is defensive programming: for an uncompressed loose object, the
mmapped memory is passed to parse_sha1_header without first being
checked by inflateInit(), so a corrupted uncompressed loose object may
cause a SIGSEGV crash.
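To make the layout above concrete, here is a minimal sketch of how the
"typename + <space> + size + '\0'" prefix could be written into a
buffer. The helper name format_loose_header is hypothetical and is not
taken from sha1_file.c; it only illustrates the byte layout that
parse_sha1_header has to read back.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of git): write "<type> <size>" followed
 * by a terminating '\0' into buf.
 *
 * Returns the header length including the '\0', or -1 if the header
 * does not fit into buflen bytes.
 */
static int format_loose_header(char *buf, size_t buflen,
                               const char *type, unsigned long size)
{
    /* snprintf returns the length it wanted to write, without the '\0'. */
    int n = snprintf(buf, buflen, "%s %lu", type, size);

    if (n < 0 || (size_t)n >= buflen)
        return -1;      /* encoding error or truncated output */
    return n + 1;       /* count the '\0' as part of the header */
}
```

For type "blob" and size 1234 this writes the ten bytes "blob 1234\0"
and returns 10; the object data would follow immediately after the '\0'.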
>> @@ -1275,7 +1278,7 @@ static int parse_sha1_header(const char *hdr, unsigned long *sizep)
>>      if (size > 9)
>>          return -1;
>>      if (size) {
>> -        for (;;) {
>> +        while (hdr < hdr_end - 1) {
>>              unsigned long c = *hdr - '0';
>>              if (c > 9)
>>                  break;
>
> OK, there's no promise here that we don't roll off the buffer.
>
> This can be fixed in the caller, ensuring we always have the '\0'
> at some point in the initial header buffer we were asked to parse:
>

Isn't it easier to solve this problem in one place and maintain it
there? Someday someone may forget that parse_sha1_header requires a
null-terminated buffer, and a corrupted uncompressed loose object
doesn't even have to be null-terminated (should such a loose object
ever turn up).
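As a rough illustration of checking the bounds in one place, the sketch
below parses the header against an explicit end pointer instead of
relying on the caller to guarantee a '\0'. It is not the patch itself:
the name parse_header_bounded, the type/type_len parameters, and the
0/-1 return convention are assumptions made for this example, and size
overflow is not checked.

```c
#include <stddef.h>

/*
 * Hypothetical sketch, not git's parse_sha1_header(): parse
 * "<type> <size>\0" from the range [hdr, hdr_end) and never read a
 * byte at or beyond hdr_end.
 *
 * Returns 0 on success, -1 on a malformed or truncated header.
 */
static int parse_header_bounded(const char *hdr, const char *hdr_end,
                                char *type, size_t type_len,
                                unsigned long *sizep)
{
    size_t i = 0;
    unsigned long size;

    if (type_len == 0)
        return -1;

    /* Type name: at most type_len - 1 bytes, terminated by a space. */
    for (;;) {
        char c;

        if (hdr >= hdr_end)
            return -1;          /* buffer ended before the space */
        c = *hdr++;
        if (c == ' ')
            break;
        if (i + 1 >= type_len)
            return -1;          /* type name too long */
        type[i++] = c;
    }
    type[i] = '\0';

    /* Size: canonical decimal, so "010" is rejected. */
    if (hdr >= hdr_end)
        return -1;
    size = (unsigned long)((unsigned char)*hdr++ - '0');
    if (size > 9)
        return -1;              /* first byte is not a digit */
    if (size) {
        while (hdr < hdr_end) {
            unsigned long c = (unsigned long)((unsigned char)*hdr - '0');

            if (c > 9)
                break;
            hdr++;
            size = size * 10 + c;   /* overflow not checked in this sketch */
        }
    }

    /* The size must be followed by a '\0' that lies inside the buffer. */
    if (hdr >= hdr_end || *hdr != '\0')
        return -1;

    *sizep = size;
    return 0;
}
```

With this shape a caller passes the mapped region as hdr and
hdr + length, and a header that lacks its terminating '\0' is rejected
rather than read past the end of the mapping.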