Re: [PATCH/RFC v3 04/13] Add documentation of the index-v5 file format

Thomas Gummerer <t.gummerer@xxxxxxxxx> writes:

> +GIT index format
> +================
> +
> +== The git index file format
> +
> +   The git index file (.git/index) documents the status of the files
> +     in the git staging area.
> +
> +   The staging area is used for preparing commits, merging, etc.

The above two are not about "index file format".  It is an
explanation of what the index is.

> +   All binary numbers are in network byte order. Version 5 is described
> +     here.

I had to read between these two lines something like

    "The index file consists of various sections; the sections
    appear in the following order in the file."

to make sense of the document.

> +   - A 20-byte header consisting of
> +
> +     sig (32-bits): Signature:
> +       The signature is { 'D', 'I', 'R', 'C' } (stands for "dircache")
> +
> +     vnr (32-bits): Version number:
> +       The current supported versions are 2, 3, 4 and 5.
> +
> +     ndir (32-bits): number of directories in the index.
> +
> +     nfile (32-bits): number of file entries in the index.
> +
> +     fblockoffset (32-bits): offset to the file block, relative to the
> +       beginning of the file.

Ok.
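
For illustration, the fixed header as described can be decoded with a
single unpack.  This is only a sketch in Python, assuming the five
fields are packed back to back with no padding; parse_v5_header and
V5Header are made-up names, not anything from the patch.

```python
import struct
from collections import namedtuple

# Five 32-bit fields in network byte order: 4 + 4 * 4 = 20 bytes.
HEADER = struct.Struct(">4sIIII")
V5Header = namedtuple("V5Header", "sig vnr ndir nfile fblockoffset")

def parse_v5_header(buf):
    """Decode the 20-byte header at the start of buf (hypothetical helper)."""
    if len(buf) < HEADER.size:
        raise ValueError("index file too short for the header")
    hdr = V5Header._make(HEADER.unpack_from(buf, 0))
    if hdr.sig != b"DIRC":
        raise ValueError("bad signature")
    return hdr
```

A reader would call this on the first 20 bytes of .git/index and then
use ndir, nfile and fblockoffset to locate the later sections.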

> +   - Offset to the extensions.
>
> +     nextensions (32-bits): number of extensions.
> +
> +     extoffset (32-bits): offset to the extension. (Possibly none, as
> +       many as indicated in the 4-byte number of extensions)

OK.

> +     headercrc (32-bits): crc checksum for the header and extension
> +       offsets

This probably needs its own "  - <section title>" at the same level
as "A 20-byte header" and "Offset to the extensions"; as it stands,
it looks as if it is part of "Offset to the extensions", which
consists of 12 bytes.
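
To make one possible reading concrete, here is a sketch of verifying
headercrc.  It rests on assumptions the quoted text does not confirm:
that nextensions directly follows the 20-byte header, that it is
followed by nextensions 4-byte extoffset entries and then headercrc,
that the checksum covers every byte before headercrc, and that the
CRC is the CRC-32 computed by zlib.  check_header_crc is a made-up
name.

```python
import struct
import zlib

def check_header_crc(buf):
    """Return True if the stored headercrc matches (layout and CRC variant assumed)."""
    # Assumed layout: nextensions sits right after the 20-byte header.
    (nextensions,) = struct.unpack_from(">I", buf, 20)
    crc_pos = 24 + 4 * nextensions      # skip the extoffset entries
    (stored,) = struct.unpack_from(">I", buf, crc_pos)
    # Assumed coverage: everything from the signature up to headercrc.
    return zlib.crc32(buf[:crc_pos]) & 0xFFFFFFFF == stored
```

If the document spelled out which bytes the checksum covers, a reader
would not have to guess at crc_pos like this.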

> +   - diroffsets (ndir * directory offsets): A directory offset for each
> +       of the ndir directories in the index, sorted by pathname (of the
> +       directory it's pointing to) (see below). The diroffsets are relative
> +       to the beginning of the direntries block. [1]

"ndir * directory offsets" confused me.  I think you meant to say that
this "diroffsets" section consists of ndir entries of something and
that each of those is a directory offset.  It is unclear how "a
directory offset" is represented, except that it is "relative to the
beginning of the direntries block" (and it is unclear what and where
the direntries block is from the information given up to this point)
and the reader can guess it is in "network byte order" (assuming it is
a binary number).  Perhaps

	diroffsets (ndir entries of "directory offset"): A 4-byte
	offset relative to the beginning of the "direntries block"
	(see below) for each of the ...

and drop the last sentence?

Other tables may want to be adjusted in a similar fashion.

> +== Directory offsets (diroffsets)
> +
> +  diroffset (32-bits): offset to the directory relative to the beginning
> +    of the index file. There are ndir + 1 offsets in the diroffset table,
> +    the last is pointing to the end of the last direntry. With this last
> +    entry, we can replace the strlen when reading each filename, by
> +    calculating its length with the offsets.

The mention of "strlen" looks very out of place.  The reader may be
able to guess that you want to say that the nth "string" is between
diroffset[n] and diroffset[n+1], and that these "string"s are densely
packed, so strlen(diroffset[n]) and diroffset[n+1]-diroffset[n] are
either the same thing or differ by a fixed amount (if each "string"
is accompanied by some fixed-length data).  But it is unclear what
these "strings" represent, especially because the name of the table
implies that you are talking about directories while the strlen
sentence talks about filenames.
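
What I am guessing at could be sketched like this, assuming the
table really is ndir + 1 consecutive 4-byte offsets in network byte
order and that the entries are densely packed; entry_spans is a
made-up name.

```python
import struct

def entry_spans(table, ndir):
    """Yield (start, length) for each of ndir entries from ndir + 1 offsets."""
    offsets = struct.unpack_from(">%dI" % (ndir + 1), table, 0)
    for n in range(ndir):
        # The extra trailing offset gives the last entry its end, so no
        # scan for a terminating NUL (strlen) is needed for any entry.
        yield offsets[n], offsets[n + 1] - offsets[n]
```

Whether that length is the length of a name or of the whole entry is
exactly what the text needs to say.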

> +== Design explanations
> + ...
> +[3] The data of the cache-tree extension and the resolve undo
> +    extension is now part of the index itself, but if other extensions
> +    come up in the future, there is no need to change the index, they
> +    can simply be added at the end.

Interesting.  When we added extensions, we said that there is no
need to change the index to add new features, they can simply be
added at the end.  Perhaps the file offset table can be added as an
extension to v2 to give us the same bisectability, and to let us
replace a single entry in place, without defining an entirely
different format?