Re: [PATCH 1/2] builtin/repack.c: simplify cruft pack aggregation

On Thu, Feb 27, 2025 at 11:52 PM Patrick Steinhardt <ps@xxxxxx> wrote:
>
> On Thu, Feb 27, 2025 at 01:29:28PM -0500, Taylor Blau wrote:
> > In 37dc6d8104 (builtin/repack.c: implement support for
> > `--max-cruft-size`, 2023-10-02), 'git repack' built on support for
> > multiple cruft packs in Git by instructing 'git pack-objects --cruft'
> > how to aggregate smaller cruft packs up to the provided threshold.
> >
> > The implementation in 37dc6d8104 worked something like the following
> > pseudo-code:
> >
> >     total_size = 0;
> >
> >     for (p in cruft packs) {
> >       if (p->pack_size + total_size < max_size) {
> >         total_size += p->pack_size;
> >         collapse(p)
> >       } else {
> >         retain(p);
> >       }
> >     }
> >
> > The original idea behind this approach was that smaller cruft packs
> > would get combined together until the sum of their sizes was no larger
> > than the given max pack size.
> >
> > There is a much simpler way to achieve this, however, which is to simply
> > combine *all* cruft packs which are smaller than the threshold,
> > regardless of what their sum is. With '--max-pack-size', 'pack-objects'
> > will split out the resulting pack into individual pack(s) if necessary
> > to ensure that the written pack(s) are each no larger than the provided
> > threshold.
>
> Hm. So the result would be a new set of packfiles where each of them is
> smaller than the threshold, right?

Are you assuming there's only one threshold, or that --max-pack-size
== --max-cruft-size?

I read this assuming --max-pack-size >> --max-cruft-size, so the odds
that the N packs smaller than --max-cruft-size add up to more than
--max-pack-size are small. Even if that does happen, it just
results in the combined cruft being split across a couple of packs.
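
To make that concrete, here is a toy model of the simplified approach in
Python. It is only a sketch, not the actual builtin/repack.c or
pack-objects logic: the function name is made up, sizes are plain
integers, and the splitting rule (fill each pack to the limit, put the
remainder in the last one) is a simplifying assumption.

```python
def repack_cruft(pack_sizes, max_cruft_size, max_pack_size):
    """Toy model: combine every cruft pack below max_cruft_size, then split.

    Hypothetical helper for illustration; sizes are abstract units.
    """
    # Packs at or above the cruft threshold are left alone.
    retained = [s for s in pack_sizes if s >= max_cruft_size]
    # Everything smaller is combined, regardless of what the sum is.
    combined = sum(s for s in pack_sizes if s < max_cruft_size)

    # Assumption: the writer fills each pack up to max_pack_size and
    # starts a new pack for whatever is left over.
    written = []
    while combined > 0:
        chunk = min(combined, max_pack_size)
        written.append(chunk)
        combined -= chunk

    return sorted(retained + written, reverse=True)


# Common case: the small packs sum to far less than max_pack_size.
print(repack_cruft([10, 20, 30, 200], max_cruft_size=100, max_pack_size=1000))
# -> [200, 60]

# Rare case: the small packs add up to more than max_pack_size.
print(repack_cruft([90, 90, 90], max_cruft_size=100, max_pack_size=200))
# -> [200, 70]
```

In the second call the combined cruft exceeds the pack limit, so it is
simply written out as two packs instead of one.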

> Wouldn't that mean that the next time
> we'll again do the same thing and try to combine the new set of cruft
> packs into one, and basically never arrive at a state where we don't
> touch the cruft packs anymore?

This would be a risk if we allow --max-cruft-size to approach or be
equal to --max-pack-size.  (And if --max-pack-size is less than
--max-cruft-size, then we'll perversely split into even more cruft
packs rather than combining as intended.)
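
A rough way to see both failure modes is to simulate repeated repacks.
The following Python sketch is a toy model under stated assumptions, not
what pack-objects actually does: packs are lists of object sizes, the
helper names are hypothetical, and a pack is closed as soon as the next
object would push it over the limit (so written packs usually end up
slightly below --max-pack-size).

```python
def split_into_packs(object_sizes, max_pack_size):
    """Greedily fill packs; start a new one when the next object won't fit."""
    packs, current = [], []
    for size in object_sizes:
        if current and sum(current) + size > max_pack_size:
            packs.append(current)
            current = []
        current.append(size)
    if current:
        packs.append(current)
    return packs


def repack_once(packs, max_cruft_size, max_pack_size):
    """Simulate one repack; return (new_packs, bytes_rewritten)."""
    # Packs at or above the cruft threshold are retained untouched.
    retained = [p for p in packs if sum(p) >= max_cruft_size]
    # All smaller packs are combined and rewritten.
    small = [p for p in packs if sum(p) < max_cruft_size]
    objects = [size for p in small for size in p]
    rewritten = split_into_packs(objects, max_pack_size)
    return retained + rewritten, sum(objects)


start = [[30] for _ in range(10)]  # ten small cruft packs, 300 units total

# Thresholds equal: every written pack stays below the cruft threshold,
# so each subsequent run rewrites everything again.
packs, cost1 = repack_once(start, max_cruft_size=100, max_pack_size=100)
packs, cost2 = repack_once(packs, max_cruft_size=100, max_pack_size=100)
print(cost1, cost2, [sum(p) for p in packs])

# Pack limit much larger than cruft threshold: second run is a no-op.
packs, cost1 = repack_once(start, max_cruft_size=100, max_pack_size=1000)
packs, cost2 = repack_once(packs, max_cruft_size=100, max_pack_size=1000)
print(cost1, cost2, [sum(p) for p in packs])

# Pack limit below cruft threshold: two packs become four.
packs, _ = repack_once([[30, 30], [30, 30]], max_cruft_size=100, max_pack_size=50)
print(len(packs))
```

With equal thresholds the model rewrites all 300 units on every run and
never settles; with a much larger pack limit the second run touches
nothing; and with the pack limit below the cruft threshold the number of
packs goes up instead of down.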




