"Derrick Stolee via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:

[snip]

>  DESCRIPTION
>  -----------
> @@ -249,6 +251,36 @@ linkgit:git-multi-pack-index[1]).
>  	Write a multi-pack index (see linkgit:git-multi-pack-index[1])
>  	containing the non-redundant packs.
>
> +--name-hash-version=<n>::
> +	While performing delta compression, Git groups objects that may be
> +	similar based on heuristics using the path to that object. While
> +	grouping objects by an exact path match is good for paths with
> +	many versions, there are benefits for finding delta pairs across
> +	different full paths. Git collects objects by type and then by a
> +	"name hash" of the path and then by size, hoping to group objects
> +	that will compress well together.
> ++
> +The default name hash version is `1`, which prioritizes hash locality by
> +considering the final bytes of the path as providing the maximum magnitude
> +to the hash function. This version excels at distinguishing short paths
> +and finding renames across directories. However, the hash function depends
> +primarily on the final 16 bytes of the path. If there are many paths in
> +the repo that have the same final 16 bytes and differ only by parent
> +directory, then this name-hash may lead to too many collisions and cause
> +poor results. At the moment, this version is required when writing
> +reachability bitmap files with `--write-bitmap-index`.
> ++
> +The name hash version `2` has similar locality features as version `1`,
> +except it considers each path component separately and overlays the hashes
> +with a shift. This still prioritizes the final bytes of the path, but also
> +"salts" the lower bits of the hash using the parent directory names. This
> +method allows for some of the locality benefits of version `1` while
> +breaking most of the collisions from a similarly-named file appearing in
> +many different directories. At the moment, this version is not allowed
> +when writing reachability bitmap files with `--write-bitmap-index` and it
> +will be automatically changed to version `1`.
> +
> +

Nit: I wonder if it'd be nicer to simply point to the documentation in
'Documentation/git-pack-objects.txt'. This would ensure we have
consistent documentation and a single source of truth.

>
>  CONFIGURATION
>  -------------
>
> diff --git a/builtin/repack.c b/builtin/repack.c
> index 05e13adb87f..5e7ff919c1a 100644
> --- a/builtin/repack.c
> +++ b/builtin/repack.c
> @@ -39,7 +39,9 @@ static int run_update_server_info = 1;
>  static char *packdir, *packtmp_name, *packtmp;
>
>  static const char *const git_repack_usage[] = {
> -	N_("git repack [<options>]"),
> +	N_("git repack [-a] [-A] [-d] [-f] [-F] [-l] [-n] [-q] [-b] [-m]\n"
> +	   "[--window=<n>] [--depth=<n>] [--threads=<n>] [--keep-pack=<pack-name>]\n"
> +	   "[--write-midx] [--name-hash-version=<n>]"),
>  	NULL
>  };

So this fixes the mismatch in t0450 which is seen below. Nit: might be
worthwhile adding this in the commit message.

[snip]
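
As an aside for readers of this thread, the difference between the two
hash versions described in the quoted documentation can be sketched
roughly as below. This is an illustration written from that prose, not
the code under review: the function names are hypothetical, and the
shift amounts (2 bits per byte, 6 bits per directory level) and the
bit-reversal step are assumptions chosen to show the idea.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Version-1 style sketch (hypothetical name). Each byte is added at
 * the top of the 32-bit value while older bytes are shifted down by
 * two bits, so bytes more than about sixteen positions from the end
 * have almost no influence on the result.
 */
static uint32_t name_hash_v1_sketch(const char *name)
{
	uint32_t c, hash = 0;

	if (!name)
		return 0;
	while ((c = (unsigned char)*name++) != 0) {
		if (isspace((int)c))
			continue; /* whitespace does not contribute */
		hash = (hash >> 2) + (c << 24);
	}
	return hash;
}

/* Reverse the bit order of a single byte held in 'c'. */
static uint32_t reverse_byte(uint32_t c)
{
	c = (c & 0xF0) >> 4 | (c & 0x0F) << 4;
	c = (c & 0xCC) >> 2 | (c & 0x33) << 2;
	c = (c & 0xAA) >> 1 | (c & 0x55) << 1;
	return c;
}

/*
 * Version-2 style sketch (hypothetical name). Every path component is
 * hashed separately; at each '/' the component hash is folded into
 * 'base', which is shifted down so that parent directories only
 * "salt" the lower bits of the final value.
 */
static uint32_t name_hash_v2_sketch(const char *name)
{
	uint32_t c, hash = 0, base = 0;

	if (!name)
		return 0;
	while ((c = (unsigned char)*name++) != 0) {
		if (isspace((int)c))
			continue;
		if (c == '/') {
			base = (base >> 6) ^ hash; /* assumed shift */
			hash = 0;
		} else {
			hash = (hash >> 2) + (reverse_byte(c) << 24);
		}
	}
	return (base >> 6) ^ hash;
}
```

With these sketches, 'a/builtin-repack.c' and 'b/builtin-repack.c' get
the same version-1 value, because the differing directory byte has been
shifted out, but different version-2 values, because the parent
directory is folded into the lower bits.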