Re: [PATCH 1/1] commit-graph.c: avoid unnecessary tag dereference when merging

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Mar 21, 2020 at 08:23:13PM -0400, Derrick Stolee wrote:

> On 3/21/2020 8:20 PM, Taylor Blau wrote:
> > On Sat, Mar 21, 2020 at 08:03:01PM -0400, Derrick Stolee wrote:
> >> On 3/21/2020 2:50 PM, Junio C Hamano wrote:
> >>> Do we need to worry about INFO_QUICK and SKIP_FETCH_OBJECT in this
> >>> codepath, by the way?
> >>
> >> I was coming back to this thread to bring up these exact flags for
> >> consideration. The good news is that in a partial clone with any
> >> amount of filtering we will still have all reachable commits, which
> >> are necessary for the commit-graph to make sense. The only ones that
> >> would fail has_object_file() are ones removed by GC, but they may
> >> still exist on the remote. So without SKIP_FETCH_OBJECT, we would
> >> generate a network call even if the server has GC'd to remove the
> >> commits. This gets particularly bad when the server returns all
> >> reachable objects from that commit!
> > 
> > That makes sense. Do you think something like this should be applied?
> > +       int flags = OBJECT_INFO_QUICK | OBJECT_INFO_SKIP_FETCH_OBJECT;
> [...]
> 
> Yes, I think this is the appropriate update. Thanks!

I'm not so sure.

In a non-partial clone, when would expect QUICK to matter? If we find
the object is missing, then either:

  - we are racing with repack, and we would benefit from double-checking
    that the object really is gone

  - we used to have it (since it was mentioned in the graph file) but it
    went away

Using QUICK means we won't waste time double-checking in the second
case. But it means we won't catch the first case, and we may generate a
new graph file that omits the object. They're both optimizations, and I
don't think we're impacting correctness[1], but it's not clear to me
whether one is a win over the other. We don't generally expect objects
we have to go away often.

Skipping fetching seems a little more straight-forward to me. If we had
it in a graph file, presumably we had the actual object before, too. And
either we're in the first case above (we really do have it and just need
to double-check) in which case not saying QUICK would be enough. Or we
intentionally got rid of it. In which case downloading it just to
generate a cache is quite silly.

So it seems like SKIP_FETCH_OBJECT but _not_ QUICK might be reasonable
here.

-Peff

[1] I'm actually not quite sure about correctness here. It should be
    fine to generate a graph file without any given commit; readers will
    just have to load that commit the old-fashioned way. But at this
    phase of "commit-graph write", I think we'll already have done the
    close_reachable() check. What does it mean to throw away a commit at
    this stage? If we're the parent of another commit, then it will have
    trouble referring to us by a uint32_t. Will the actual writing phase
    barf, or will we generate an invalid graph file?




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux