On 2020-01-20 at 23:53:22, Christoph Groth wrote:
> Johannes Schindelin wrote:
> >
> > On Sat, 18 Jan 2020, Christoph Groth wrote:
> >
> > > OK, I see. But please consider (one day) to split up the index file
> > > to separate the local stat cache from the globally valid data.
> >
> > I am sure that this has been considered even before Git was publicly
> > announced,
>
> I would be very interested to hear the rationale for keeping the
> information about what is staged and the stat cache together in the same
> file. I, or someone else, might actually work on a patch one day, but
> before starting, it would be good to understand the reasoning behind the
> current design.
>
> > and I would wager a guess that it was determined that it would be
> > better to keep all of Git's private data in one place.
>
> My point is that it’s not just private data: When I excluded .git/index
> from synchronization, staging files for a commit was no longer
> synchronized.

To try to answer this question: Git stores all of its state about the
working tree in the index. Bare repositories don't typically have an
index because they don't have a working tree. Whether that state is
staged contents or stat information, all of it is in one file.

Storing all of this data in one file means that only one file need be
mapped into memory and rewritten. Git writes to the index by atomically
creating a lock file alongside it, writing the new contents into the
lock file, and then doing an atomic replace. This approach wouldn't
carry over to multiple files: each file could be replaced atomically on
its own, but an update spanning several of them wouldn't be atomic as a
whole.

There is support for a split index mode, which means that the main index
need not be rewritten as often. That is helpful when making small
updates to large trees, where the cost of rewriting the index is
significant. I don't know how locking is handled there[0], but I assume
that it is, because the people who implemented and reviewed it are
capable and thoughtful.
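
As a rough illustration of the lock-file-and-replace pattern described
above, here is a minimal sketch in Python. It is not Git's actual C
implementation; the helper name atomic_update is hypothetical, and the
atomicity of the final replace is assumed to be that of a POSIX rename.

```python
import os


def atomic_update(path, data):
    """Replace the file at `path` with `data` using a lock file.

    Hypothetical helper for illustration only; not Git's code.
    """
    lock_path = path + ".lock"
    # O_EXCL makes creation fail if the lock file already exists, so
    # only one writer can hold the lock at a time.
    fd = os.open(lock_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o666)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        # Assumption: on POSIX this rename is atomic, so readers see
        # either the complete old file or the complete new one.
        os.replace(lock_path, path)
    except BaseException:
        # Release our own lock if writing or replacing failed.
        try:
            os.unlink(lock_path)
        except FileNotFoundError:
            pass
        raise
```

A second writer that finds the lock file already present gets a
FileExistsError and leaves both the lock and the existing file
untouched. The guarantee covers exactly one file per replace, which is
why spreading the state over several files would lose it.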

However, having said that, nobody has provided a compelling case for
using multiple files for storing different types of working tree state.
The existing options are available for cases like yours and others',
and they work. Since there are clear benefits to the current model,
including simplicity and robustness, and few downsides, nobody has
decided to change it.

I should add that even if, for some reason, we did add support for
splitting this data out, I'm not sure if we'd support syncing only part
of the repository state and blowing away other state. We don't really
support that now (other than through tools like fetch and clone) and I
don't think we'd want to encourage that behavior in the future.

[0] And I have not had the interest to look into it at the present
moment.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204