On Thu, Oct 12, 2023 at 02:54:17PM -0700, Junio C Hamano wrote: > Taylor Blau <me@xxxxxxxxxxxx> writes: > > > diff --git a/Documentation/gitformat-pack.txt b/Documentation/gitformat-pack.txt > > index d7153962d4..54000c9412 100644 > > --- a/Documentation/gitformat-pack.txt > > +++ b/Documentation/gitformat-pack.txt > > @@ -392,8 +392,9 @@ CHUNK DATA: > > Packfile Names (ID: {'P', 'N', 'A', 'M'}) > > Stores the packfile names as concatenated, NUL-terminated strings. > > Not a problem this series (neither this or the previous step) > introduces, but I had to read the implementation of > write_midx_pack_names() to find out what "concatenated > NUL-terminated string" really means. The code has a list of > strings, writes each of them as a NUL-terminated string in sequence, > and to align the beginning of the next chunk, NULs are added to make > the whole thing multiple of MIDX_CHUNK_ALIGNMENT bytes. > > A naive reader code might implement a loop like so: > > while (ptr[0] != '\0') { > endp = strlen(ptr); > ... ptr[0..endp] is one pathname ... > ptr = endp + 1; > } > > expecting that the terminating NUL of the last entry would be > followed by a NUL, but that is buggy. The sum of the pathname > strings (with one NUL after each) may happen to be multiple of > MIDX_CHUNK_ALIGNMENT bytes, in which case no extra padding NUL bytes > will be there. So the reader also needs to pay attention to the > chunk size to notice when to stop reading. It feels somewhat > suboptimal. I agree. > > Packfiles must be listed in lexicographic order for fast lookups by > > - name. This is the only chunk not guaranteed to be a multiple of four > > - bytes in length, so should be the last chunk for alignment reasons. > > + name. Individual entries in this chunk are not guarenteed to be > > + aligned. The chunk is externally padded with zeros to align > > + remaining chunks. > > I am not sure what "externally padded" means. How about something like this, instead? --- 8< --- diff --git a/Documentation/gitformat-pack.txt b/Documentation/gitformat-pack.txt index 0bc80f0d46..229490f82f 100644 --- a/Documentation/gitformat-pack.txt +++ b/Documentation/gitformat-pack.txt @@ -392,9 +392,10 @@ CHUNK DATA: Packfile Names (ID: {'P', 'N', 'A', 'M'}) Stores the packfile names as concatenated, NUL-terminated strings. Packfiles must be listed in lexicographic order for fast lookups by - name. Individual entries in this chunk are not guarenteed to be - aligned. The chunk is externally padded with zeros to align - remaining chunks. + name. Individual entries in this chunk are not guaranteed to be + aligned, since the packfile names can be of arbitrary length. The + chunk itself is padded at the end with NUL bytes in order to align + the remaining chunks. OID Fanout (ID: {'O', 'I', 'D', 'F'}) The ith entry, F[i], stores the number of OIDs with first --- >8 --- Thanks, Taylor