Re: [PATCH 3/6] Stop producing index version 2

Thomas Rast <trast@xxxxxxxxxxx> · Tue, 7 Feb 2012 18:25:43 +0100

Shawn Pearce <spearce@xxxxxxxxxxx> writes:

> I have long wanted to scrap the current index format. I unfortunately
> don't have the time to do it myself. But I suspect there may be a lot
> of gains by making the index format match the canonical tree format
> better by keeping the tree structure within a single file stream,
> nesting entries below their parent directory, and keeping tree SHA-1
> data along with the directory entry.

If I may add to this: the one thing that I would like to see fixed about
the index is that it's flat out impossible to change a single thing in
it without re"writing" it from scratch.

I'm saying "writing" because it is possible to change a few things
around in place, but recomputing the trailing SHA1 swamps that saving by
a large margin unless you are writing to a floppy disk, so the in-place
update doesn't buy anything.  I'm sure using a CRC32 helps here, but if
we're going to make an incompatible change, why not go all the way?

A tree layout can fix that if it is properly arranged so that if you
'git add path/to/file', it only updates the SHA1s for path/to/file,
path/to and path.  For this to work, the checksums would have to
correspond to the trees, perhaps even directly use the actual tree
SHA1.  This would at least be natural in some sense; getting to actual
log(n) complexity for hilariously large directories would require
dynamically splitting directories where appropriate.
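
To make the idea concrete, here is a minimal sketch in Python.  It is
not git's on-disk format or its real tree hashing; the names Node and
ToyIndex and the byte layout fed to SHA-1 are made up for illustration.
Each directory caches a checksum over its children, and adding a file
invalidates only the directories on the path to it.

```python
import hashlib


class Node:
    """A directory in a toy tree-shaped index (hypothetical layout)."""

    def __init__(self):
        self.files = {}     # file name -> hex digest of the file content
        self.dirs = {}      # subdirectory name -> Node
        self.digest = None  # cached directory checksum; None means stale


class ToyIndex:
    def __init__(self):
        self.root = Node()
        self.rehashed = 0   # number of directory checksums computed so far

    def add(self, path, content):
        """Record a file and invalidate only the directories on its path."""
        *parents, name = path.split("/")
        node = self.root
        node.digest = None
        for part in parents:
            node = node.dirs.setdefault(part, Node())
            node.digest = None
        node.files[name] = hashlib.sha1(content).hexdigest()

    def checksum(self, node=None):
        """Return a directory's checksum, reusing cached values."""
        node = self.root if node is None else node
        if node.digest is None:
            h = hashlib.sha1()
            for name in sorted(node.dirs):
                child = self.checksum(node.dirs[name])
                h.update(b"d " + name.encode() + b"\0" + child.encode())
            for name in sorted(node.files):
                h.update(b"f " + name.encode() + b"\0"
                         + node.files[name].encode())
            node.digest = h.hexdigest()
            self.rehashed += 1
        return node.digest
```

In this toy, changing path/to/file after a full checksum recomputes
three directory checksums (path/to, path and the root); unrelated
directories keep their cached values.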

Along the same lines the format should allow for changing the extension
data for a single extension while only rehashing the new data.
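
A similarly rough sketch for the extension case, again in Python and
again with a made-up container (ExtensionTable is hypothetical, and the
signatures below are used only as example keys): if every extension
carries its own checksum, replacing one of them hashes only the new
payload.

```python
import hashlib


class ExtensionTable:
    """Toy container where each extension stores its own checksum."""

    def __init__(self):
        self.entries = {}      # signature -> (payload, hex digest)
        self.bytes_hashed = 0  # payload bytes hashed by set() so far

    def set(self, signature, payload):
        """Replace one extension; only the new payload is hashed."""
        self.bytes_hashed += len(payload)
        digest = hashlib.sha1(payload).hexdigest()
        self.entries[signature] = (payload, digest)

    def verify(self, signature):
        """Check one extension against its stored checksum."""
        payload, digest = self.entries[signature]
        return hashlib.sha1(payload).hexdigest() == digest
```

With a single trailing checksum over the whole file, the same update
would have to hash every other extension and every entry as well.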

When I worked on cache-tree, I considered making a change to the latter
effect, but thought the impact too great for too little gain.  Now from
this thread, I'm getting the impression that such a change would be ok,
even if users would have to scrap the index if they downgrade.  Is that
right?

-- 
Thomas Rast
trast@{inf,student}.ethz.ch