[PATCH v2 0/5] repack: introduce '--combine-cruft-below-size'

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Here is a small reroll of my series to introduce a new 'repack' flag
called '--combine-cruft-below-size'. There aren't any functional changes
in this round, and the changes that do exist are limited to
documentation and commit message tweaks in the final patch.

A range-diff is included below, as well as the original cover letter:

(This is based on tb/multi-cruft-pack-refresh-fix from Junio's tree,
which is at 08f612ba70 (builtin/pack-objects.c: freshen objects from
existing cruft packs, 2025-03-13) at the time of writing).

This series replaces something close to the existing behavior of
repack's '--max-cruft-size' flag with '--combine-cruft-below-size'.

The new flag is much clearer in its intent and function, and avoids the
lack of clarity between the two that was discussed in

  <cover.1741648467.git.me@xxxxxxxxxxxx>

The new behavior is as follows:

  - '--max-cruft-size' is a cruft pack-specific override for repack's
    '--max-pack-size' command-line flag.

  - '--combine-cruft-below-size' instructs repack to only combine cruft
    packs which are smaller than the given threshold. This will likely
    result in packs which are larger than the threshold. But that is OK:
    the point is to limit the size of the individual packs on input, not
    the size of the outgoing pack.

This series does break the existing behavior of '--max-cruft-size'. But
I think breaking backwards compatibility here is OK, since the existing
behavior was so broken to begin with. I'm happy to consider other
alternatives if others feel that this isn't OK.

The series has an interesting structure that I feel may be worth calling
out. The first three patches are trivial test cleanups. The fourth patch
modifies the existing behavior of '--max-cruft-size', but does so while
keeping some of the old infrastructure around.

That may seem like an unnecessarily complicated approach, but it greatly
simplifies introducing the new behavior in the following (and final)
commit. I think that this results in a series that is a little easier to
review (since we don't see a ton of code being removed in one commit and
then re-added in another immediately following it). But if others feel
like this was a mistake, please let me know ;-).

Thanks in advance for your review!

Taylor Blau (5):
  t/t5329-pack-objects-cruft.sh: evict 'repack'-related tests
  t/t7704-repack-cruft.sh: clarify wording in --max-cruft-size tests
  t/t7704-repack-cruft.sh: consolidate `write_blob()`
  repack: avoid combining cruft packs with `--max-cruft-size`
  repack: begin combining cruft packs with `--combine-cruft-below-size`

 Documentation/git-repack.adoc |  21 ++-
 builtin/repack.c              |  62 +++----
 t/t5329-pack-objects-cruft.sh | 302 ++++++----------------------------
 t/t7704-repack-cruft.sh       | 293 ++++++++++++++++++++++++++++++---
 4 files changed, 355 insertions(+), 323 deletions(-)

Range-diff against v1:
1:  0aa8aa65c1 = 1:  0aa8aa65c1 t/t5329-pack-objects-cruft.sh: evict 'repack'-related tests
2:  5e8bd3e66e = 2:  5e8bd3e66e t/t7704-repack-cruft.sh: clarify wording in --max-cruft-size tests
3:  b075ad8601 = 3:  b075ad8601 t/t7704-repack-cruft.sh: consolidate `write_blob()`
4:  7941997e33 = 4:  7941997e33 repack: avoid combining cruft packs with `--max-cruft-size`
5:  7f120c35e9 ! 5:  dee780a2aa repack: begin combining cruft packs with `--combine-cruft-below-size`
    @@ Commit message
         Introduce a new flag, '--combine-cruft-below-size' which is a
         replacement for the old behavior of '--max-cruft-size'. This new flag
         does explicitly what it says: it combines together cruft packs which are
    -    smaller than a given threshold, and prohibits repacking ones which are
    +    smaller than a given threshold, and leaves alone ones which are
         larger.

         This accomplishes the original intent of '--max-cruft-size', which was
    @@ Commit message
         But that's OK: the point isn't to restrict the size of the cruft packs
         we generate, it's to avoid working with ones that have already grown too
         large. If repositories still want to limit the size of the generated
    -    cruft pack(s), they may use '--max-cruft-size' instead.
    +    cruft pack(s), they may use '--max-cruft-size'.

         There's some minor test fallout as a result of the slight differences in
         behavior between the old meaning of '--max-cruft-size' and the behavior
    @@ Documentation/git-repack.adoc: to the new separate pack will be written.

     +--combine-cruft-below-size=<n>::
     +	When generating cruft packs without pruning, only repack
    -+	existing cruft packs whose size is strictly less than `<n>`.
    -+	Cruft packs whose size is greater than or equal to `<n>` are
    -+	left as-is and not repacked. Useful when you want to avoid
    -+	repacking large cruft pack(s) in repositories that have many
    -+	and/or large unreachable objects.
    ++	existing cruft packs whose size is strictly less than `<n>`,
    ++	where `<n>` represents a number of bytes, which can optionally
    ++	be suffixed with "k", "m", or "g". Cruft packs whose size is
    ++	greater than or equal to `<n>` are left as-is and not repacked.
    ++	Useful when you want to avoid repacking large cruft pack(s) in
    ++	repositories that have many and/or large unreachable objects.
     +
      --expire-to=<dir>::
      	Write a cruft pack containing pruned objects (if any) to the

base-commit: 08f612ba7000bf181ef6d8baed9ece322e567efd
--
2.49.0.4.ge59cf92f8d




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux