[PATCH 00/19] midx: incremental multi-pack indexes, part one

This series implements incremental MIDXs, which allow for storing
a MIDX across multiple layers, each with its own distinct set of
packs.

MOTIVATION
==========

Splitting the MIDX into layers allows large repositories to make use of
the MIDX feature without having to rewrite the entire MIDX every time
they want to update the set of packs contained in the MIDX. For
extremely large repositories, such a full rewrite is often infeasible.

OVERVIEW
========

This series implements the first component of incremental MIDXs, meaning
by the end of it you can run:

    $ git multi-pack-index write --incremental

a couple of times, and produce a directory structure like:

    $ tree .git/objects/pack/multi-pack-index.d
    .git/objects/pack/multi-pack-index.d
    ├── multi-pack-index-chain
    ├── multi-pack-index-baa53bc5092bed50378fe9232ae7878828df2890.midx
    └── multi-pack-index-f60023a8a104be94eab96dd7c42a6a5db67c82ba.midx

where each *.midx file behaves the same way as the existing
non-incremental MIDX does today, but the files are stitched together as
MIDX "layers", so you no longer have to rewrite the whole MIDX any time
you want to modify it.
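
To make the layout above concrete, here is a minimal sketch that maps a
chain file to the layer files it names. It assumes (this cover letter
does not spell it out) that `multi-pack-index-chain` lists one layer
checksum per line, in chain order. The helper name `midx_chain_layers`
is hypothetical and not part of Git.

```shell
# Hypothetical helper: print the path of each MIDX layer named by the
# chain file in directory $1, in the order the chain file lists them.
# Assumption: the chain file holds one layer checksum per line.
midx_chain_layers () {
	dir=$1
	while read -r hash
	do
		# Skip blank lines rather than emitting a bogus path.
		test -n "$hash" || continue
		echo "$dir/multi-pack-index-$hash.midx"
	done <"$dir/multi-pack-index-chain"
}
```

Under that assumption, running
`midx_chain_layers .git/objects/pack/multi-pack-index.d` on the
directory shown above would print the two `*.midx` paths in whatever
order the chain file records them.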

This is "part one" of a multi-part series. The overview of how all of
these series fit together is as follows:

  - "Part zero": preparatory work like 'tb/midx-write-cleanup' and my
    series to clean up temporary file handling [1, 2].

  - "Part one": this series, which enables reading and writing
    incremental MIDXs, but does not have support for more advanced
    features like bitmap support or rewriting parts of the MIDX chain.

  - "Part two": the next series, which adds support for multi-pack
    reachability bitmaps in an incremental MIDX world, meaning that each
    `*.midx` layer can have its own `*.bitmap`, and the bitmaps at each
    layer can be used together.

  - "Part three": which supports more advanced management of the MIDX
    chain, like compressing intermediate layers to avoid the chain
    growing too long.

Parts zero, one, and two all exist, and the first two have been shared
with the list. Part two exists in ttaylorr/git [3], but is excluded from
this series to keep the length manageable. I avoided sending this series
until I was confident that bitmaps worked on top of incremental MIDXs to
avoid designing ourselves into a corner.

Part three doesn't exist yet, but is straightforward to do on top. None
of the design decisions made in this series inhibit my goals for part
three.

[1]: https://lore.kernel.org/git/cover.1717023301.git.me@xxxxxxxxxxxx/
[2]: https://lore.kernel.org/git/cover.1717712358.git.me@xxxxxxxxxxxx/
[3]: https://github.com/ttaylorr/git/compare/tb/incremental-midx...ttaylorr:git:tb/incremental-midx-bitmaps

Taylor Blau (19):
  Documentation: describe incremental MIDX format
  midx: add new fields for incremental MIDX chains
  midx: teach `nth_midxed_pack_int_id()` about incremental MIDXs
  midx: teach `prepare_midx_pack()` about incremental MIDXs
  midx: teach `nth_midxed_object_oid()` about incremental MIDXs
  midx: teach `nth_bitmapped_pack()` about incremental MIDXs
  midx: introduce `bsearch_one_midx()`
  midx: teach `bsearch_midx()` about incremental MIDXs
  midx: teach `nth_midxed_offset()` about incremental MIDXs
  midx: teach `fill_midx_entry()` about incremental MIDXs
  midx: remove unused `midx_locate_pack()`
  midx: teach `midx_contains_pack()` about incremental MIDXs
  midx: teach `midx_preferred_pack()` about incremental MIDXs
  midx: teach `midx_fanout_add_midx_fanout()` about incremental MIDXs
  midx: support reading incremental MIDX chains
  midx: implement verification support for incremental MIDXs
  t: retire 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP'
  t/t5313-pack-bounds-checks.sh: prepare for sub-directories
  midx: implement support for writing incremental MIDX chains

 Documentation/git-multi-pack-index.txt       |  11 +-
 Documentation/technical/multi-pack-index.txt | 100 +++++
 builtin/multi-pack-index.c                   |   2 +
 builtin/repack.c                             |   8 +-
 ci/run-build-and-tests.sh                    |   2 +-
 midx-write.c                                 | 293 +++++++++++--
 midx.c                                       | 410 ++++++++++++++++---
 midx.h                                       |  26 +-
 object-name.c                                |  99 ++---
 packfile.c                                   |  21 +-
 packfile.h                                   |   4 +
 t/README                                     |   6 +-
 t/helper/test-read-midx.c                    |  24 +-
 t/lib-bitmap.sh                              |   6 +-
 t/lib-midx.sh                                |  28 ++
 t/t0410-partial-clone.sh                     |   2 -
 t/t5310-pack-bitmaps.sh                      |   4 -
 t/t5313-pack-bounds-checks.sh                |   8 +-
 t/t5319-multi-pack-index.sh                  |  30 +-
 t/t5326-multi-pack-bitmaps.sh                |   4 +-
 t/t5327-multi-pack-bitmaps-rev.sh            |   6 +-
 t/t5332-multi-pack-reuse.sh                  |   2 +
 t/t5334-incremental-multi-pack-index.sh      |  46 +++
 t/t7700-repack.sh                            |  48 +--
 24 files changed, 935 insertions(+), 255 deletions(-)
 create mode 100755 t/t5334-incremental-multi-pack-index.sh


base-commit: 680474691b4639280a73baa0bb8792634f99f611
-- 
2.45.2.437.gecb9450a0e



