Re: [PATCH 00/12] Integrate commit-graph into fsck, gc, and fetch

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, May 10 2018, Derrick Stolee wrote:

> The behavior in this patch series does the following:
>
> 1. Near the end of 'git gc', run 'git commit-graph write'. The location
>    of this code assumes that a 'git gc --auto' has not terminated early
>    due to not meeting the auto threshold.
>
> 2. At the end of 'git fetch', run 'git commit-graph write'. This means
>    that every reachable commit will be in the commit-graph after a
>    a successful fetch, which seems a reasonable frequency. Then, the
>    only times we would be missing a reachable commit is after creating
>    one locally. There is a problem with the current patch, though: every
>    'git fetch' call runs 'git commit-graph write', even if there were no
>    ref updates or objects downloaded. Is there a simple way to detect if
>    the fetch was non-trivial?
>
> One obvious problem with this approach: if we compute this during 'gc'
> AND 'fetch', there will be times where a 'fetch' calls 'gc' and triggers
> two commit-graph writes. If I were to abandon one of these patches, it
> would be the 'fetch' integration. A 'git gc' really wants to delete all
> references to unreachable commits, and without updating the commit-graph
> we may still have commit data in the commit-graph file that is not in
> the object database. In fact, deleting commits from the object database
> but not from the commit-graph will cause 'git commit-graph verify' to
> fail!
>
> I welcome discussion on these ideas, as we are venturing out of the
> "pure data structure" world and into the "user experience" world. I am
> less confident in my skills in this world, but the feature is worthless
> if it does not improve the user experience.

I really like #1 here, but I wonder why #2 is necessary.

I.e. is it critical for the performance of the commit graph feature that
it be kept really up-to-date, moreso than other things that rely on gc
--auto (e.g. the optional bitmap index)?

Even if that's the case, I think something that does this via gc --auto
is a much better option. I.e. now we have gc.auto & gc.autoPackLimit, if
the answer to my question above is "yes" this could also be accomplished
by introducing a new graph-specific gc.* setting, and --auto would just
update the graph more often, but leave the rest.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux