On Thu, May 10 2018, Derrick Stolee wrote:

> The behavior in this patch series does the following:
>
> 1. Near the end of 'git gc', run 'git commit-graph write'. The location
>    of this code assumes that a 'git gc --auto' has not terminated early
>    due to not meeting the auto threshold.
>
> 2. At the end of 'git fetch', run 'git commit-graph write'. This means
>    that every reachable commit will be in the commit-graph after a
>    successful fetch, which seems a reasonable frequency. Then, the
>    only times we would be missing a reachable commit is after creating
>    one locally. There is a problem with the current patch, though: every
>    'git fetch' call runs 'git commit-graph write', even if there were no
>    ref updates or objects downloaded. Is there a simple way to detect if
>    the fetch was non-trivial?
>
> One obvious problem with this approach: if we compute this during 'gc'
> AND 'fetch', there will be times where a 'fetch' calls 'gc' and triggers
> two commit-graph writes. If I were to abandon one of these patches, it
> would be the 'fetch' integration. A 'git gc' really wants to delete all
> references to unreachable commits, and without updating the commit-graph
> we may still have commit data in the commit-graph file that is not in
> the object database. In fact, deleting commits from the object database
> but not from the commit-graph will cause 'git commit-graph verify' to
> fail!
>
> I welcome discussion on these ideas, as we are venturing out of the
> "pure data structure" world and into the "user experience" world. I am
> less confident in my skills in this world, but the feature is worthless
> if it does not improve the user experience.

I really like #1 here, but I wonder why #2 is necessary. I.e. is it
critical for the performance of the commit graph feature that it be kept
really up-to-date, more so than other things that rely on gc --auto
(e.g. the optional bitmap index)?

Even if that's the case, I think something that does this via gc --auto
is a much better option. I.e. we now have gc.auto & gc.autoPackLimit; if
the answer to my question above is "yes", this could also be
accomplished by introducing a new graph-specific gc.* setting, and
--auto would then just update the graph more often, but leave the rest
alone.
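
To make that concrete, here is a rough sketch of the decision I have in mind. It is illustrative Python, not git's actual C code. The 6700 and 50 values are the documented defaults for gc.auto and gc.autoPackLimit. The graph threshold, its value of 100, and the names AutoGcThresholds and plan_auto_gc are all hypothetical, and how git would cheaply count commits missing from the graph is left open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AutoGcThresholds:
    loose_objects: int = 6700  # documented default of gc.auto
    packs: int = 50            # documented default of gc.autoPackLimit
    # HYPOTHETICAL: stands in for a new graph-specific gc.* setting.
    # Neither the setting nor the value 100 exists in git.
    graph_missing_commits: int = 100


def plan_auto_gc(loose, packs, missing_from_graph, t=AutoGcThresholds()):
    """Return the steps a 'gc --auto' run would take (names are illustrative)."""
    if loose > t.loose_objects or packs > t.packs:
        # Full gc; per #1 in the quoted series, the commit-graph write
        # happens near the end of gc.
        return ["repack", "commit-graph write"]
    if missing_from_graph > t.graph_missing_commits:
        # Only the graph is stale: update it and leave the rest alone.
        return ["commit-graph write"]
    # Nothing crossed a threshold, so --auto terminates early.
    return []
```

With something like this, a fetch that brings in many commits but does not cross the loose-object or pack limits would still get its graph refreshed, and the graph would never be written twice in one run.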