[RFC PATCH 0/4] move pruned objects to a separate repository

Now that cruft packs are available in v2.37.0, here is an interesting
application of that new feature to enable a two-phase object pruning
approach.

This came out of a discussion within GitHub about ways we could support
storing a set of pruned objects in "limbo" so that they were not
accessible from the repository which pruned them, but instead stored in
a cruft pack in a separate repository which lists the original one as an
alternate.

This makes it possible to take the collection of all pruned objects and
store them in a cruft pack in a separate repository. This repository
(which I have been referring to as the "expired.git") can then be used
as a donor repository for any missing objects (like the ones described
by the race in [1]).
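
As a rough illustration of the donor idea (this is not part of the
series): one way to ask whether an object that has gone missing is still
available from the donor is `git cat-file -e`, which exits zero when the
object exists. The helper name `object_in_donor` and the `../expired.git`
path below are assumptions made for the sketch.

```shell
# Hypothetical helper (not part of this series): succeed if object $2
# exists in the repository whose git directory is $1.
object_in_donor () {
    git --git-dir="$1" cat-file -e "$2" 2>/dev/null
}

# Example use, assuming the donor lives at ../expired.git:
#   object_in_donor ../expired.git "$oid" && echo "recoverable"
```

Since the donor lists the original repository as an alternate, this
lookup also sees objects that still live in the original, so it is only
meaningful for objects that repository already reports as missing.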

The first few patches are preparatory. The final one implements writing
the pruned objects separately. The trick is to write another cruft pack
to a separate repository, with two tweaks:

  - the `--cruft-expiration` value is set to "never", since we want to
    keep around all of the objects we expired in the previous step, and

  - the original cruft pack appears as a pack that we are going to keep,
    meaning all unreachable objects that are stored in the original
    cruft pack are excluded from the one we write to the "expired.git"
    repository.

You can try this out yourself by doing something like:

    $ git init --bare ../expired.git
    $ git repack --cruft --cruft-expiration=1.day.ago -d \
        --expire-to=../expired.git/objects/pack/pack

which will create two cruft packs:

  - one in the repository which ran `git repack` containing all
    unreachable objects written within the last day, and
  - another in the "expired.git" repository which contains all
    unreachable objects written prior to the last day
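
To check the result, it helps to know that a cruft pack is accompanied
by a `.mtimes` file next to its `.pack` and `.idx`. A minimal sketch
that counts cruft packs in a pack directory follows; the helper name
`count_cruft_packs` is made up for illustration, and the paths assume a
non-bare repository with the donor at `../expired.git`.

```shell
# Hypothetical helper: print the number of cruft packs in pack
# directory $1, identified by their accompanying .mtimes file.
count_cruft_packs () {
    find "$1" -maxdepth 1 -type f -name 'pack-*.mtimes' | wc -l | tr -d ' '
}

# Example use after the repack above:
#   count_cruft_packs .git/objects/pack
#   count_cruft_packs ../expired.git/objects/pack
```

Either directory can legitimately report zero if there were no
unreachable objects on that side of the expiration cutoff.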

This series is an RFC for now since I'm interested in discussing whether
this is a feature that people would actually want to use.
But if it is, I'm happy to polish this up and turn it into a
non-RFC-quality series ;-).

In the meantime, thanks for your review!

[1]: https://lore.kernel.org/git/YryF+vkosJOXf+mQ@nand.local/

Taylor Blau (4):
  builtin/repack.c: pass "out" to `prepare_pack_objects`
  builtin/repack.c: pass "cruft_expiration" to `write_cruft_pack`
  builtin/repack.c: write cruft packs to arbitrary locations
  builtin/repack.c: implement `--expire-to` for storing pruned objects

 Documentation/git-repack.txt |   6 ++
 builtin/repack.c             |  67 ++++++++++++++++---
 t/t7700-repack.sh            | 121 +++++++++++++++++++++++++++++++++++
 3 files changed, 186 insertions(+), 8 deletions(-)

-- 
2.37.0.1.g1379af2e9d


