Re: [PATCH v3 6/8] maintenance: add incremental-repack task

> +incremental-repack::
> +	The `incremental-repack` job repacks the object directory
> +	using the `multi-pack-index` feature. In order to prevent race
> +	conditions with concurrent Git commands, it follows a two-step
> +	process.

[snip]

> First, it deletes any pack-files included in the
> +	`multi-pack-index` where none of the objects in the
> +	`multi-pack-index` reference those pack-files; this only happens
> +	if all objects in the pack-file are also stored in a newer
> +	pack-file. Second, it selects a group of pack-files whose "expected
> +	size" is below the batch size until the group has total expected
> +	size at least the batch size; see the `--batch-size` option for
> +	the `repack` subcommand in linkgit:git-multi-pack-index[1]. The
> +	default batch-size is zero, which is a special case that attempts
> +	to repack all pack-files into a single pack-file.

This omits what happens to the selected group of packfiles in the
second step: a new packfile is created and the MIDX is rewritten so that
all references to the selected group point to the new packfile instead,
which makes it possible to delete the selected group the next time the
first step runs. All of this is already explained in the documentation
of git-multi-pack-index (expire and repack), though, so it might be
better to refer to that. E.g.

  First, it calls `multi-pack-index expire` to delete packfiles
  unreferenced by the MIDX file. Second, it calls `multi-pack-index
  repack` to select several small packfiles and repack them into a
  bigger one, and then update the MIDX entries that refer to the small
  packfiles to refer to the big one instead, thus preparing them for
  deletion upon a subsequent `multi-pack-index expire` invocation. The
  selection of the small packfiles is such that the expected size of the
  big packfile is at least the batch size; see the ...
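
To make the interplay between the two steps concrete, here is a toy
model in C. It is only a sketch: `struct toy_midx`, `toy_expire()` and
`toy_repack()` are hypothetical names, not the code in midx.c, and the
rule that fewer than two selected packs means "nothing to do" is an
assumption of the sketch. Expected sizes are taken as given rather than
computed.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_PACKS 8
#define TOY_NO_PACK   (-1)

/* Toy model only: these are not Git's real data structures. */
struct toy_midx {
	int object_pack[16];            /* pack id each object is read from */
	size_t nr_objects;
	int pack_exists[TOY_MAX_PACKS]; /* 1 while the pack-file is on disk */
	uint64_t expected_size[TOY_MAX_PACKS];
	int nr_packs;
};

/* Step 1 ("expire"): delete packs that no object entry refers to. */
int toy_expire(struct toy_midx *m)
{
	int refs[TOY_MAX_PACKS] = { 0 };
	int deleted = 0;
	size_t i;
	int p;

	for (i = 0; i < m->nr_objects; i++)
		refs[m->object_pack[i]]++;
	for (p = 0; p < m->nr_packs; p++) {
		if (m->pack_exists[p] && !refs[p]) {
			m->pack_exists[p] = 0;
			deleted++;
		}
	}
	return deleted;
}

/*
 * Step 2 ("repack"): select small packs, add one new pack holding
 * their objects, and repoint the object entries to it. The old packs
 * stay on disk until the next toy_expire(). A batch_size of 0 selects
 * every existing pack. Returns the new pack id, or TOY_NO_PACK if
 * fewer than two packs were selected (assumption of this sketch).
 */
int toy_repack(struct toy_midx *m, uint64_t batch_size)
{
	int selected[TOY_MAX_PACKS] = { 0 };
	uint64_t total = 0;
	int nr_selected = 0, new_pack, p;
	size_t i;

	for (p = 0; p < m->nr_packs; p++) {
		if (!m->pack_exists[p])
			continue;
		if (batch_size) {
			if (total >= batch_size)
				break;          /* batch is full */
			if (m->expected_size[p] >= batch_size)
				continue;       /* pack is too big to include */
		}
		selected[p] = 1;
		total += m->expected_size[p];
		nr_selected++;
	}
	if (nr_selected < 2 || m->nr_packs >= TOY_MAX_PACKS)
		return TOY_NO_PACK;

	new_pack = m->nr_packs++;
	m->pack_exists[new_pack] = 1;
	m->expected_size[new_pack] = total;
	for (i = 0; i < m->nr_objects; i++)
		if (selected[m->object_pack[i]])
			m->object_pack[i] = new_pack;
	return new_pack;
}
```

With three packs of expected sizes 10, 20 and 500 and a batch size of
100, toy_repack() selects the first two, adds a fourth pack and repoints
their objects to it. The two small packs are still on disk at that
point; only the following toy_expire() deletes them, which is the
ordering the suggested wording above tries to spell out.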

> diff --git a/midx.c b/midx.c
> index aa37d5da86..66d7053d83 100644
> --- a/midx.c
> +++ b/midx.c
> @@ -37,7 +37,7 @@
>  
>  #define PACK_EXPIRED UINT_MAX
>  
> -static char *get_midx_filename(const char *object_dir)
> +char *get_midx_filename(const char *object_dir)
>  {
>  	return xstrfmt("%s/pack/multi-pack-index", object_dir);
>  }
> diff --git a/midx.h b/midx.h
> index b18cf53bc4..baeecc70c9 100644
> --- a/midx.h
> +++ b/midx.h
> @@ -37,6 +37,7 @@ struct multi_pack_index {
>  
>  #define MIDX_PROGRESS     (1 << 0)
>  
> +char *get_midx_filename(const char *object_dir);
>  struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local);
>  int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, uint32_t pack_int_id);
>  int bsearch_midx(const struct object_id *oid, struct multi_pack_index *m, uint32_t *result);

Do we need get_midx_filename() to be global?


