On Wed, Mar 19, 2025 at 3:52 PM Taylor Blau <me@xxxxxxxxxxxx> wrote: > > Here is a small reroll of my series to introduce a new 'repack' flag > called '--combine-cruft-below-size'. There aren't any functional changes > in this round, and the changes that do exist are limited to > documentation and commit message tweaks in the final patch. > > A range-diff is included below, as well as the original cover letter: > > (This is based on tb/multi-cruft-pack-refresh-fix from Junio's tree, > which is at 08f612ba70 (builtin/pack-objects.c: freshen objects from > existing cruft packs, 2025-03-13) at the time of writing). > > This series replaces something close to the existing behavior of > repack's '--max-cruft-size' flag with '--combine-cruft-below-size'. > > The new flag is much clearer in its intent and function, and avoids the > lack of clarity between the two that was discussed in > > <cover.1741648467.git.me@xxxxxxxxxxxx> > > The new behavior is as follows: > > - '--max-cruft-size' is a cruft pack-specific override for repack's > '--max-pack-size' command-line flag. > > - '--combine-cruft-below-size' instructs repack to only combine cruft > packs which are smaller than the given threshold. This will likely > result in packs which are larger than the threshold. But that is OK: > the point is to limit the size of the individual packs on input, not > the size of the outgoing pack. > > This series does break the existing behavior of '--max-cruft-size'. But > I think breaking backwards compatibility here is OK, since the existing > behavior was so broken to begin with. I'm happy to consider other > alternatives if others feel that this isn't OK. > > The series has an interesting structure that I feel may be worth calling > out. The first three patches are trivial test cleanups. The fourth patch > modifies the existing behavior of '--max-cruft-size', but does so while > keeping some of the old infrastructure around. > > That may seem like an unnecessarily complicated approach, but it greatly > simplifies introducing the new behavior in the following (and final) > commit. I think that this results in a series that is a little easier to > review (since we don't see a ton of code being removed in one commit and > then re-added in another immediately following it). But if others feel > like this was a mistake, please let me know ;-). > > Thanks in advance for your review! > > Taylor Blau (5): > t/t5329-pack-objects-cruft.sh: evict 'repack'-related tests > t/t7704-repack-cruft.sh: clarify wording in --max-cruft-size tests > t/t7704-repack-cruft.sh: consolidate `write_blob()` > repack: avoid combining cruft packs with `--max-cruft-size` > repack: begin combining cruft packs with `--combine-cruft-below-size` > > Documentation/git-repack.adoc | 21 ++- > builtin/repack.c | 62 +++---- > t/t5329-pack-objects-cruft.sh | 302 ++++++---------------------------- > t/t7704-repack-cruft.sh | 293 ++++++++++++++++++++++++++++++--- > 4 files changed, 355 insertions(+), 323 deletions(-) > > Range-diff against v1: > 1: 0aa8aa65c1 = 1: 0aa8aa65c1 t/t5329-pack-objects-cruft.sh: evict 'repack'-related tests > 2: 5e8bd3e66e = 2: 5e8bd3e66e t/t7704-repack-cruft.sh: clarify wording in --max-cruft-size tests > 3: b075ad8601 = 3: b075ad8601 t/t7704-repack-cruft.sh: consolidate `write_blob()` > 4: 7941997e33 = 4: 7941997e33 repack: avoid combining cruft packs with `--max-cruft-size` > 5: 7f120c35e9 ! 5: dee780a2aa repack: begin combining cruft packs with `--combine-cruft-below-size` > @@ Commit message > Introduce a new flag, '--combine-cruft-below-size' which is a > replacement for the old behavior of '--max-cruft-size'. This new flag > does explicitly what it says: it combines together cruft packs which are > - smaller than a given threshold, and prohibits repacking ones which are > + smaller than a given threshold, and leaves alone ones which are > larger. > > This accomplishes the original intent of '--max-cruft-size', which was > @@ Commit message > But that's OK: the point isn't to restrict the size of the cruft packs > we generate, it's to avoid working with ones that have already grown too > large. If repositories still want to limit the size of the generated > - cruft pack(s), they may use '--max-cruft-size' instead. > + cruft pack(s), they may use '--max-cruft-size'. > > There's some minor test fallout as a result of the slight differences in > behavior between the old meaning of '--max-cruft-size' and the behavior > @@ Documentation/git-repack.adoc: to the new separate pack will be written. > > +--combine-cruft-below-size=<n>:: > + When generating cruft packs without pruning, only repack > -+ existing cruft packs whose size is strictly less than `<n>`. > -+ Cruft packs whose size is greater than or equal to `<n>` are > -+ left as-is and not repacked. Useful when you want to avoid > -+ repacking large cruft pack(s) in repositories that have many > -+ and/or large unreachable objects. > ++ existing cruft packs whose size is strictly less than `<n>`, > ++ where `<n>` represents a number of bytes, which can optionally > ++ be suffixed with "k", "m", or "g". Cruft packs whose size is > ++ greater than or equal to `<n>` are left as-is and not repacked. > ++ Useful when you want to avoid repacking large cruft pack(s) in > ++ repositories that have many and/or large unreachable objects. > + > --expire-to=<dir>:: > Write a cruft pack containing pruned objects (if any) to the > > base-commit: 08f612ba7000bf181ef6d8baed9ece322e567efd > -- > 2.49.0.4.ge59cf92f8d v2 + your comments address all my feedback from v1, so this round looks good to me.