Re: [PATCH 03/10] builtin/pack-objects.c: learn '--assume-kept-packs-closed'

On Fri, Jan 29, 2021 at 06:28:28PM -0500, Taylor Blau wrote:
> On Fri, Jan 29, 2021 at 03:03:08PM -0800, Junio C Hamano wrote:
> > Do our goals still include that the resulting packfile has good
> > delta compression and object locality?  Reachability traversal
> > discovers which commit comes close to which other commits to help
> > pack-objects to arrange the resulting pack so that objects that
> > appear close together in history appear close together.  It also
> > gives each object a pathname hint to help group objects of the same
> > type (either blobs or trees) with like-paths together for better
> > deltification.
>
> I think our goals here are somewhere between having fewer packfiles
> while also ensuring that the packfiles we had to create don't have
> horrible delta compression and locality.
>
> But now that you do mention it, I remember the reachability traversal's
> bringing in object names was a reason that we decided to implement this
> series using a reachability traversal in the first place.

Peff shared a very clever idea with me today. Like in the naive
approach, we fill the list of "objects to pack" with everything in the
packs that are about to get rolled up, excluding anything that appears
in the large packs.

But on top of that, we do a reachability traversal whose starting
points are all of the commits in the packs that are about to be rolled
up, filling in the namehash of the objects we encounter along the way.

Like in the original version of this series, we'll stop early once we
encounter an object in any of the frozen packs (which are marked as kept
in core), and so we might not traverse through everything. But that's
completely OK, since we know we have the right list of objects to pack
(at worst, we would have some zeroed namehashes and come up with
slightly worse deltas).
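
To make the shape of that concrete, here is a rough toy model of the
two steps. It is Python rather than the actual C in pack-objects, and
every name in it (plan_rollup, toy_name_hash, the object IDs, the edge
table) is made up for illustration. The hash is a stand-in, not
necessarily what Git computes:

```python
from collections import deque


def toy_name_hash(path):
    """Toy path hash (an assumption for this sketch, not Git's real one)."""
    h = 0
    for ch in path:
        if not ch.isspace():
            h = ((h >> 2) + (ord(ch) << 24)) & 0xFFFFFFFF
    return h


def plan_rollup(rollup_oids, kept_oids, commits, edges):
    """Return {oid: namehash} for the objects that go into the new pack."""
    # Step 1 (same as the naive approach): everything in the packs being
    # rolled up, minus anything that also appears in a kept pack.
    to_pack = {oid: 0 for oid in rollup_oids if oid not in kept_oids}

    # Step 2: walk from the commits in the rolled-up packs, only to
    # discover paths. The walk never adds or removes objects.
    queue = deque((oid, None) for oid in sorted(to_pack) if oid in commits)
    seen = set()
    while queue:
        oid, path = queue.popleft()
        if oid in seen or oid in kept_oids:
            continue  # stop early at anything in a kept pack
        seen.add(oid)
        if oid in to_pack and path is not None:
            to_pack[oid] = toy_name_hash(path)
        queue.extend(edges.get(oid, ()))
    return to_pack


# Hypothetical history: c2 is new, c1 and its tree t1 live in a kept pack.
# Each edge is (child, path); parent links carry no path.
edges = {
    "c2": [("c1", None), ("t2", "")],
    "t2": [("b2", "src/main.c"), ("b1", "README")],
    "c1": [("t1", "")],
    "t1": [("b3", "docs/notes.txt")],
}
result = plan_rollup(
    rollup_oids={"c2", "t2", "b2", "b3"},
    kept_oids={"c1", "t1", "b1"},
    commits={"c1", "c2"},
    edges=edges,
)
for oid in sorted(result):
    print(oid, hex(result[oid]))
```

In this toy setup b3 is only reachable through the kept commit c1, so
the walk never sees its path and it keeps a zero namehash. It is still
in the list of objects to pack, which is the point: the traversal only
improves the hints, it does not decide membership.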

But, I think that this is a nice middle-ground (and it allows us to
reuse lots of work from the original version), so I'm quite happy.

It's in my fork [1] in the tb/geometric-repack.wip branch, but I'll try
and clean those patches up tomorrow and send a v2 to the list.

Thanks,
Taylor

[1]: https://github.com/ttaylorr/git


