Shawn Pearce <spearce@xxxxxxxxxxx> writes:

> I have long wanted to scrap the current index format. I unfortunately
> don't have the time to do it myself. But I suspect there may be a lot
> of gains by making the index format match the canonical tree format
> better by keeping the tree structure within a single file stream,
> nesting entries below their parent directory, and keeping tree SHA-1
> data along with the directory entry.

If I may add to this: the one thing that I would like to see fixed
about the index is that it's flat out impossible to change a single
thing in it without re"writing" it from scratch.

I'm saying "writing" because it is possible to change a few things
around, but recomputing the trailing SHA1 swamps that by a large
margin unless you are writing to a floppy disk, so it doesn't matter.
I'm sure using a CRC32 helps here, but if we're going to make an
incompatible change, why not go all the way?

A tree layout can fix that if it is properly arranged so that if you
'git add path/to/file', it only updates the SHA1s for path/to/file,
path/to and path. For this to work, the checks would have to
correspond to the trees, perhaps even directly use the actual tree
SHA1. This would at least be natural in some sense; getting to actual
log(n) complexity for hilariously large directories would require
dynamically splitting directories where appropriate.

Along the same lines the format should allow for changing the
extension data for a single extension while only rehashing the new
data.

When I worked on cache-tree, I considered making a change to the
latter effect, but thought the impact too great for a little gain.
Now from this thread, I'm getting the impression that such a change
would be ok, even if users would have to scrap the index if they
downgrade. Is that right?
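
To make the path/to/file point concrete, here is a rough Python sketch.
It is not git's index format and does not use git's tree object
encoding; Dir, rehash() and add() are names made up for illustration,
and the per-directory checksum is just a SHA1 over the immediate
entries. The only thing it is meant to show is that with one cached
checksum per directory, an update touches the directories on the path
and nothing else.

```python
import hashlib


class Dir:
    """Hypothetical directory node: entries by name plus a cached checksum."""

    def __init__(self):
        self.children = {}  # name -> Dir, or bytes (a file's blob hash)
        self.sha1 = None    # cached checksum covering this directory
        self.rehash()

    def rehash(self):
        # Hash only the immediate entries. A subdirectory contributes its
        # cached checksum, so nothing below this level is revisited.
        h = hashlib.sha1()
        for name in sorted(self.children):
            child = self.children[name]
            if isinstance(child, Dir):
                kind, digest = b"d", child.sha1
            else:
                kind, digest = b"f", child
            h.update(kind + name.encode() + b"\0" + digest)
        self.sha1 = h.digest()


def add(root, path, blob_sha1):
    """Record blob_sha1 at path; return how many directories were rehashed.

    Assumes no existing file sits where a directory is needed.
    """
    parts = path.split("/")
    chain = [root]
    for name in parts[:-1]:
        chain.append(chain[-1].children.setdefault(name, Dir()))
    chain[-1].children[parts[-1]] = blob_sha1
    # Recompute checksums bottom-up along the path only.
    for node in reversed(chain):
        node.rehash()
    return len(chain)
```

With this, add(root, "path/to/file", ...) rehashes path/to, path and
the top level, however many entries live in unrelated directories. The
cost per directory is still linear in its number of entries, which is
where the splitting of very large directories would come in.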
--
Thomas Rast
trast@{inf,student}.ethz.ch