Re: [PATCH v4 4/6] pack-objects: generate cruft packs at most one object over threshold


 



On Thu, Mar 13, 2025 at 5:16 AM Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> Elijah Newren <newren@xxxxxxxxx> writes:
>
> >> With below-size and max-size set to say 180 and 200 respectively, an
> >> attempt to combine the crufts may end up filling a cruft pack to 170
> >> but the smallest of the remaining cruft may weigh 40, which means
> >> including it would cause the max-size to be exceeded.  In such a
> >> scenario, there may not be a solution to satisfy given constraints,
> >> i.e. to go above the below-size while staying below the max-size.
> >>
> >> So I am not sure if the approach would really solve much.
> >>
> >> Other than that, separate names, especially losing "max" from the
> >> threshold that really does not mean "max", would solve the confusion
> >> that comes from naming, that is.
> >
> > --max-pack-size is a constraint.  --combine-cruft-below-size is not.
> > Think particularly of the case where the user doesn't even have any
> > cruft packs yet and has only accumulated a little bit of cruft.  That
> > option is merely a guide post to say that if it's smaller than that
> > size, then feel free to keep trying to add to it (so long as it
> > doesn't violate constraints such as --max-pack-size).
>
> That is correct and it is why I said the suggestion solves the name
> confusion.  But think about the sample situation, before and after
> such a repack with two thresholds.  You had below- and max-size set
> to 180 and 200 respectively, and a cruft pack of size 170, and you
> failed to grow that cruft pack beyond 180 because the next available
> cruft weighed 40.  Then you'll repeat the exercise again, find 170
> that is smaller than the below- threshold, try to cram more and
> would fail.  Isn't that what Taylor's series wanted to prevent from
> happening, and isn't the two-threshold approach supposed to be a way
> to improve on it?

I don't think "always combine" is necessary for improvement.  Perhaps,
in your example, this round of repacking can't combine things.  But
the next time we want to repack cruft objects and there is anything
new that (individually or collectively) weighs between 10 and 30, we
can add it and get something over the lower threshold and then ignore
the resulting cruft pack in the future.
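
To make the arithmetic concrete, here is a rough Python sketch of that
scenario.  It is only a toy model, not what pack-objects does: the
try_grow() helper, the greedy smallest-first order, and treating the
maximum as inclusive are all assumptions made for illustration.

```python
def try_grow(pack, incoming, below, max_size):
    """Toy model: add cruft sizes to `pack` while it is still under
    `below`, never letting it exceed `max_size` (assumed inclusive).
    Returns (new_pack_size, leftover_sizes)."""
    leftover = []
    for size in sorted(incoming):  # smallest first; an assumed order
        if pack < below and pack + size <= max_size:
            pack += size
        else:
            leftover.append(size)
    return pack, leftover


BELOW, MAX_SIZE = 180, 200

# Round 1: only the 40-unit cruft is available, so nothing can be added.
pack_round1, rest_round1 = try_grow(170, [40], BELOW, MAX_SIZE)

# Round 2: 15 units of new cruft have shown up since the last repack.
pack_round2, rest_round2 = try_grow(pack_round1, rest_round1 + [15],
                                    BELOW, MAX_SIZE)

# At or above the lower threshold, the pack need not be revisited.
ignored_in_future = pack_round2 >= BELOW
```

In this model the first round leaves the pack at 170, and the second
round brings it to 185, at which point it is over the lower threshold
and the 40-unit cruft has to go somewhere else.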

In contrast, the single threshold either has to violate the maximum
constraint, or always reconsider everything.  The two-threshold system
allows progress to be made (so long as it doesn't just look at the
first biggest object and fail every time), but particularly if you set
the thresholds too close to each other or you just have really large
cruft, then you _sometimes_ might not make progress.

Personally, I think I'd set the --combine-cruft-below-size to half of
--max-pack-size, because that guarantees that any two existing cruft
packs being considered for combining can be, and the resulting
combined cruft pack, if big enough, can then be ignored in the future.
In other words, this scheme would allow you to always make progress.
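
As a sanity check on the "half" rule, here is a small Python sketch.
The fits() and still_candidate() helpers and the sizes are made up for
illustration, and the maximum is again assumed to be inclusive; this
checks the arithmetic only, not any actual repack behavior.

```python
from itertools import product


def fits(a, b, max_size):
    """True if two packs can be combined without exceeding max_size."""
    return a + b <= max_size


def still_candidate(size, below):
    """True if a pack is small enough to be considered for combining."""
    return size < below


MAX_SIZE = 200
BELOW = MAX_SIZE // 2  # lower threshold set to half of the maximum

# Every pair of packs under the lower threshold fits under the maximum.
candidates = range(1, BELOW)
all_pairs_fit = all(fits(a, b, MAX_SIZE)
                    for a, b in product(candidates, repeat=2))

# A combined pack that reaches the lower threshold drops out of
# consideration; a smaller one remains a candidate for the next round.
combined = 60 + 70
combined_is_ignored = not still_candidate(combined, BELOW)
```

With thresholds of 180 and 200 instead, fits(170, 40, 200) is false,
which is the stuck case from the earlier example.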

Am I missing something?




