[PATCH v2 0/2] midx: apply gitconfig to midx repack

Midx repack has so far been used mostly on the client side, in Microsoft
Scalar, to optimize a repository's multi-pack state. However, when I tried
to apply it on the server side, I realized that it lacks certain features
compared to git repack. Most of these features are highly desirable on the
server side, where the goal is to create the most optimized pack possible.

One example is delta_base_offset: comparing a midx repack with and without
delta_base_offset shows a noticeable size difference (about 3% in the run
below).

> du objects/pack/*pack
14536   objects/pack/pack-08a017b424534c88191addda1aa5dd6f24bf7a29.pack
9435280 objects/pack/pack-8829c53ad1dca02e7311f8e5b404962ab242e8f1.pack

Latest 2.26.2 (without delta_base_offset)
> git multi-pack-index write
> git multi-pack-index repack
> git multi-pack-index expire
> du objects/pack/*pack
9446096 objects/pack/pack-366c75e2c2f987b9836d3bf0bf5e4a54b6975036.pack

With delta_base_offset
> git version
git version 2.26.2.672.g232c24e857.dirty
> git multi-pack-index write
> git multi-pack-index repack
> git multi-pack-index expire
> du objects/pack/*pack
9152512 objects/pack/pack-3bc8c1ec496ab95d26875f8367ff6807081e9e7d.pack
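
To put a number on the difference between the two runs above, here is a
small sketch of the arithmetic. It assumes both du figures are in the same
block unit (1 KiB with GNU du's default); the variable names are only for
illustration.

```shell
before=9446096   # pack size from the 2.26.2 run above, in du blocks
after=9152512    # pack size from the patched run above, in du blocks

saved=$((before - after))

# Work in permille so plain POSIX integer arithmetic is enough.
permille=$((saved * 1000 / before))

echo "saved ${saved} blocks (~$((permille / 10)).$((permille % 10))%)"
# prints: saved 293584 blocks (~3.1%)
```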

In this patch, I am intentionally leaving out repack.writeBitmaps, as I see
that it might need some updates to pack-objects to improve performance first.

Derrick Stolee's following patch addresses repack.packKeptObjects support.
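
For anyone wanting to try this, here is a minimal sketch of the
configuration involved, using a throwaway repository so nothing real is
touched. The scratch path from mktemp is purely illustrative; the three keys
are the repack.* settings this series reads.

```shell
# Scratch repository for illustration only.
repo=$(mktemp -d)
git init -q "$repo"

# Settings consulted by `git multi-pack-index repack` with this series applied.
git -C "$repo" config repack.useDeltaBaseOffset true   # patch 1: --delta-base-offset
git -C "$repo" config repack.useDeltaIslands true      # patch 1: --delta-islands
git -C "$repo" config repack.packKeptObjects false     # patch 2: respected when false

# Show what was recorded.
git -C "$repo" config --get-regexp '^repack\.'
```

In a real repository, the same write / repack / expire sequence shown
earlier would follow once these are set.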

Derrick Stolee (1):
  multi-pack-index: respect repack.packKeptObjects=false

Son Luong Ngoc (1):
  midx: apply gitconfig to midx repack

 Documentation/git-multi-pack-index.txt |  3 +++
 midx.c                                 | 36 ++++++++++++++++++++++----
 t/t5319-multi-pack-index.sh            | 26 +++++++++++++++++++
 3 files changed, 60 insertions(+), 5 deletions(-)


base-commit: b34789c0b0d3b137f0bb516b417bd8d75e0cb306
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-626%2Fsluongng%2Fsluongngoc%2Fmidx-config-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-626/sluongng/sluongngoc/midx-config-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/626

Range-diff vs v1:

 1:  215c882a503 ! 1:  21c648cc486 midx: apply gitconfig to midx repack
     @@ Commit message
          In this patch, I applies those flags into `git multi-pack-index repack`
          so that it respect the `repack.*` config series.
      
     -    Note: I left out `repack.packKeptObjects` intentionally as I dont think
     -    its relevant to midx repack use case.
     +    Note:
     +    - `repack.packKeptObjects` will be addressed by Derrick Stolee in
     +    the following patch
     +    - `repack.writeBitmaps` when `--batch-size=0` was NOT adopted here as it
     +    requires `--all` to be passed onto `git pack-objects`, which is very
     +    slow. I think it would be nice to have this in a future patch.
      
          Signed-off-by: Son Luong Ngoc <sluongng@xxxxxxxxx>
      
       ## midx.c ##
     -@@ midx.c: static int fill_included_packs_batch(struct repository *r,
     - 	return 0;
     - }
     +@@ midx.c: int midx_repack(struct repository *r, const char *object_dir, size_t batch_size,
     + 	struct child_process cmd = CHILD_PROCESS_INIT;
     + 	struct strbuf base_name = STRBUF_INIT;
     + 	struct multi_pack_index *m = load_multi_pack_index(object_dir, 1);
     ++	int delta_base_offset = 1;
     ++	int use_delta_islands;
       
     -+static int delta_base_offset = 1;
     -+static int write_bitmaps = -1;
     -+static int use_delta_islands;
     -+
     - int midx_repack(struct repository *r, const char *object_dir, size_t batch_size, unsigned flags)
     - {
     - 	int result = 0;
     + 	if (!m)
     + 		return 0;
      @@ midx.c: int midx_repack(struct repository *r, const char *object_dir, size_t batch_size,
       	} else if (fill_included_packs_all(m, include_pack))
       		goto cleanup;
       
     -+  git_config_get_bool("repack.usedeltabaseoffset", &delta_base_offset);
     -+  git_config_get_bool("repack.writebitmaps", &write_bitmaps);
     -+  git_config_get_bool("repack.usedeltaislands", &use_delta_islands);
     ++	repo_config_get_bool(r, "repack.usedeltabaseoffset", &delta_base_offset);
     ++	repo_config_get_bool(r, "repack.usedeltaislands", &use_delta_islands);
      +
       	argv_array_push(&cmd.args, "pack-objects");
       
     @@ midx.c: int midx_repack(struct repository *r, const char *object_dir, size_t bat
      +		argv_array_push(&cmd.args, "--delta-base-offset");
      +	if (use_delta_islands)
      +		argv_array_push(&cmd.args, "--delta-islands");
     -+	if (write_bitmaps > 0)
     -+		argv_array_push(&cmd.args, "--write-bitmap-index");
     -+	else if (write_bitmaps < 0)
     -+		argv_array_push(&cmd.args, "--write-bitmap-index-quiet");
      +
       	if (flags & MIDX_PROGRESS)
       		argv_array_push(&cmd.args, "--progress");
 -:  ----------- > 2:  3d7b334f5c6 multi-pack-index: respect repack.packKeptObjects=false

-- 
gitgitgadget


