Re: [PATCH 2/2] Documentation/gitformat-pack.txt: fix incorrect MIDX documentation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Oct 12, 2023 at 02:54:17PM -0700, Junio C Hamano wrote:
> Taylor Blau <me@xxxxxxxxxxxx> writes:
>
> > diff --git a/Documentation/gitformat-pack.txt b/Documentation/gitformat-pack.txt
> > index d7153962d4..54000c9412 100644
> > --- a/Documentation/gitformat-pack.txt
> > +++ b/Documentation/gitformat-pack.txt
> > @@ -392,8 +392,9 @@ CHUNK DATA:
> >  	Packfile Names (ID: {'P', 'N', 'A', 'M'})
> >  	    Stores the packfile names as concatenated, NUL-terminated strings.
>
> Not a problem this series (neither this or the previous step)
> introduces, but I had to read the implementation of
> write_midx_pack_names() to find out what "concatenated
> NUL-terminated string" really means.  The code has a list of
> strings, writes each of them as a NUL-terminated string in sequence,
> and to align the beginning of the next chunk, NULs are added to make
> the whole thing multiple of MIDX_CHUNK_ALIGNMENT bytes.
>
> A naive reader code might implement a loop like so:
>
> 	while (ptr[0] != '\0') {
> 		endp = strlen(ptr);
> 		... ptr[0..endp] is one pathname ...
> 		ptr = endp + 1;
> 	}
>
> expecting that the terminating NUL of the last entry would be
> followed by a NUL, but that is buggy.  The sum of the pathname
> strings (with one NUL after each) may happen to be multiple of
> MIDX_CHUNK_ALIGNMENT bytes, in which case no extra padding NUL bytes
> will be there.  So the reader also needs to pay attention to the
> chunk size to notice when to stop reading.  It feels somewhat
> suboptimal.

I agree.

> >  	    Packfiles must be listed in lexicographic order for fast lookups by
> > -	    name. This is the only chunk not guaranteed to be a multiple of four
> > -	    bytes in length, so should be the last chunk for alignment reasons.
> > +	    name. Individual entries in this chunk are not guarenteed to be
> > +	    aligned. The chunk is externally padded with zeros to align
> > +	    remaining chunks.
>
> I am not sure what "externally padded" means.

How about something like this, instead?

--- 8< ---
diff --git a/Documentation/gitformat-pack.txt b/Documentation/gitformat-pack.txt
index 0bc80f0d46..229490f82f 100644
--- a/Documentation/gitformat-pack.txt
+++ b/Documentation/gitformat-pack.txt
@@ -392,9 +392,10 @@ CHUNK DATA:
 	Packfile Names (ID: {'P', 'N', 'A', 'M'})
 	    Stores the packfile names as concatenated, NUL-terminated strings.
 	    Packfiles must be listed in lexicographic order for fast lookups by
-	    name. Individual entries in this chunk are not guarenteed to be
-	    aligned. The chunk is externally padded with zeros to align
-	    remaining chunks.
+	    name. Individual entries in this chunk are not guaranteed to be
+	    aligned, since the packfile names can be of arbitrary length. The
+	    chunk itself is padded at the end with NUL bytes in order to align
+	    the remaining chunks.

 	OID Fanout (ID: {'O', 'I', 'D', 'F'})
 	    The ith entry, F[i], stores the number of OIDs with first
--- >8 ---

Thanks,
Taylor




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux