On Sun, Jan 06, 2008 at 02:00:57PM -0800, Junio C Hamano wrote: > Martin Koegler <mkoegler@xxxxxxxxxxxxxxxxx> writes: > > diff --git a/commit.c b/commit.c > > index f074811..ffa0894 100644 > > --- a/commit.c > > +++ b/commit.c > > @@ -48,19 +48,33 @@ struct commit *lookup_commit(const unsigned char *sha1) > > return check_commit(obj, sha1, 0); > > } > > > > -static unsigned long parse_commit_date(const char *buf) > > +static unsigned long parse_commit_date(const char *buf, const char* tail) > > Should be "const char *tail" in our codebase. Will fix. > > { > > unsigned long date; > > + char datebuf[20]; > > + unsigned long len; > > > > + if (buf + 6 >= tail) > > + return 0; > > if (memcmp(buf, "author", 6)) > > return 0; > > Even though buf, which is a result from read_sha1_file(), is > always terminated with an extra NUL (outside its object size), > if a bogus commit object ends with "author" (and without the > author information) this part will pass, and ... fsck_commit (builtin-fsck.c) does quite redundant checks (except that it checks for author too). Should I make parse_commit fail, if there is no author and commiter line? This way, we could remove | if (memcmp(buffer, "tree ", 5)) | return objerror(&commit->object, "invalid format - expected 'tree' line"); | if (get_sha1_hex(buffer+5, tree_sha1) || buffer[45] != '\n') | return objerror(&commit->object, "invalid 'tree' line format - bad sha1"); | buffer += 46; | while (!memcmp(buffer, "parent ", 7)) { | if (get_sha1_hex(buffer+7, sha1) || buffer[47] != '\n') | return objerror(&commit->object, "invalid 'parent' line format - bad sha1"); | buffer += 48; | } | if (memcmp(buffer, "author ", 7)) | return objerror(&commit->object, "invalid format - expected 'author' line"); from fsck_commit. > > - while (*buf++ != '\n') > > + while (buf < tail && *buf++ != '\n') > > /* nada */; > > + if (buf + 9 >= tail) > > + return 0; > > ... you catch that here. That seems like a good change. > > > if (memcmp(buf, "committer", 9)) > > return 0; > > - while (*buf++ != '>') > > + while (buf < tail && *buf++ != '>') > > /* nada */; > > - date = strtoul(buf, NULL, 10); > > + if (buf >= tail) > > + return 0; > > Likewise here. > > > + len = tail - buf; > > + if (len > sizeof(datebuf) - 1) > > + len = sizeof(datebuf) - 1; > > Broken indentation. Will fix. > > + memcpy(datebuf, buf, len); > > + datebuf[len] = 0; > > + date = strtoul(datebuf, NULL, 10); > > However, as long as buf at this point hasn't go beyond tail, > which you already checked, I think we can rely on strtoul() > stopping at the NUL at the end of buffer (that is one beyond > tail), without this extra memcpy(). Am I mistaken? No. parse_commit_buffer is only called from parse_commit (safe, as it uses a buffer by read_sha1_file) and parse_object_buffer (safe, as it is called by parse_object (safe as it uses a buffer by read_sha1_file) and get_obj (in builtin-for-each-ref.c, safe as it uses a buffer by read_sha1_file)). > > @@ -275,7 +289,7 @@ int parse_commit_buffer(struct commit *item, void *buffer, unsigned long size) > > n_refs++; > > } > > } > > - item->date = parse_commit_date(bufptr); > > + item->date = parse_commit_date(bufptr, tail); > > > > if (track_object_refs) { > > unsigned i = 0; > > -- > > 1.4.4.4 > > When already somewhat deep in the rc cycle, looking at a patch > from somebody who uses 1.4.4.4 makes me look at the patch a bit > more carefully than usual ;-) stg is much simpler to use for such patches. My distribution [Linus already called it insafe for it's git version] ships stg 0.11, which is sufficient for this. I simply use stg with the shipped git 1.4.4.4 to avoid any compatibiltiy problems. mfg Martin Kögler PS: The function of get_obj in builtin-for-each-ref.c looks quite similar to parse_object. - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html