On Sun, Jan 05, 2025 at 09:49:51PM +0800, shejialuo wrote: > We have already implemented the header consistency check for the raw > "packed-refs" file. Before we implement the consistency check for each > ref entry, let's analysis [1] which reports that "git fsck" cannot > detect some binary zeros. > > "packed-backend.c::next_record" will use "check_refname_format" to check > the consistency of the refname. If it is not OK, the program will die. > So, we already have the code path and we must miss out something. > > We use the following code to get the refname: > > strbuf_add(&iter->refname_buf, p, eol - p); > iter->base.refname = iter->refname_buf.buf > > In the above code, `p` is the start pointer of the refname and `eol` is > the next newline pointer. We calculate the length of the refname by > subtracting the two pointers. Then we add the memory range between `p` > and `eol` to get the refname. > > However, if there are some NULL binaries in the memory range between `p` You probably mean NUL characters, not NULL binaries? > diff --git a/refs/packed-backend.c b/refs/packed-backend.c > index 3b11abe5f8..f6142a4402 100644 > --- a/refs/packed-backend.c > +++ b/refs/packed-backend.c > @@ -493,6 +493,23 @@ static void verify_buffer_safe(struct snapshot *snapshot) > last_line, eof - last_line); > } > > +/* > + * When parsing the "packed-refs" file, we will parse it line by line. > + * Because we know the start pointer of the refname and the next > + * newline pointer, we could calculate the length of the refname by > + * subtracting the two pointers. However, there is a corner case where > + * the refname contains corrupted embedded NULL binaries. And > + * `check_refname_format()` will not catch this when the truncated > + * refname is still a valid refname. To prevent this, we need to check > + * whether the refname contains the NULL binaries. > + */ > +static int refname_contains_null(struct strbuf refname) > +{ > + if (refname.len != strlen(refname.buf)) > + return 1; > + return 0; > +} > + > #define SMALL_FILE_SIZE (32*1024) > > /* > @@ -894,6 +911,9 @@ static int next_record(struct packed_ref_iterator *iter) > strbuf_add(&iter->refname_buf, p, eol - p); > iter->base.refname = iter->refname_buf.buf; > > + if (refname_contains_null(iter->refname_buf)) We can replace this with `memchr(iter->refname_buf.buf, '\0', iter->refname_buf.len)`, which should be more efficient than using strlen(3p). > + die("packed refname contains embedded NULL: %s", iter->base.refname); > + I was a bit surprised to find that we modify the way that we read refs from the packed-refs file instead of adapting the fsck code. But I think this check is sensible. Patrick