Re: [PATCH] pack-objects: disable pack reuse for object-selection options

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, May 08, 2017 at 01:56:27PM +0900, Junio C Hamano wrote:

> Jeff King <peff@xxxxxxxx> writes:
> 
> > If certain options like --honor-pack-keep, --local, or
> > --incremental are used with pack-objects, then we need to
> > feed each potential object to want_object_in_pack() to see
> > if it should be filtered out.  This is totally contrary to
> > the purpose of the pack-reuse optimization, which tries hard
> > to avoid doing any per-object work.  Therefore we need to
> > disable this optimization when these options are in use.
> 
> I read this five times, as I wanted to understand what you are
> saying, but I am not sure if I got it right.  One of the reasons why
> I was confused was that I originally thought this "reuse" was about
> delta reuse, but it is not.  It is "sending a slice of the original
> packfile straight to the output".

Right. The "send a slice" goes under the name "reuse_packfile" in the
code, but I'm not surprised you're not familiar with it. We added it
long ago as part of the bitmap feature, and it's not often talked about
(and in its current incarnation, isn't actually very useful; I have
patches to improve that, but haven't gotten around to upstreaming them).

> But even after getting myself out
> of that confusion, I still do not see why we "need to disable".
> Surely, even if we need to exclude some objects from an existing
> packfile due to these selection options, we should be able to reuse
> the non-excluded part, no?  The end result may involve having to
> pick and reuse more and smaller slices from existing packfiles,
> which may be much less efficient, but it is no immediately obvious
> to me if it leads to "need to disable".  I would understand it if it
> were "it becomes much less efficient and we are better off not using
> the bitmap code at all", though.

Yes, it's this last bit. The main win of the packfile reuse code is that
it builds on the bitmaps to avoid doing as much per-object work as
possible. So the objects don't even get added to the list of "struct
object_entry", and we never consider them for the "should they be in the
result" checks beyond the have/want computation done by the bitmaps.

We could add those checks in, but what's the point? The idea of the
reuse code is to be a fast path for serving vanilla clones. Searching
through all of the packfiles for a .keep entry is the antithesis of
that.

> Is the real reason why we want to disable the "reuse" because it is
> not easy to update the reuse_partial_packfile_from_bitmap() logic to
> take these selection options into account?  If so, I would also
> understand why this is a good change.

This is a side concern for the current form. With the patches I
mentioned above, it's not too hard to omit some objects and still reuse
the other bits verbatim. But again, even evaluating the function for
each pack is too expensive, even if the implementation supported it.

> > +test_expect_success 'pack reuse respects --honor-pack-keep' '
> > +	test_when_finished "rm -f .git/objects/pack/*.keep" &&
> > +	for i in .git/objects/pack/*.pack; do
> > +		>${i%.pack}.keep
> > +	done &&
> 
> Micronit: style.

I assume you mean putting "do" on a separate line. Sorry, the "; do" is
my native tongue and I sometimes slip back into it without thinking.

-Peff



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]