Re: [PATCH 0/4] pack-objects: create new name-hash algorithm

Junio C Hamano <gitster@xxxxxxxxx> writes:

> Derrick Stolee <stolee@xxxxxxxxx> writes:
>
>> The thing that surprised me is just how effective this is for the
>> creation of large pack-files that include many versions of most
>> files. The cross-path deltas have less of an effect here, and the
>> benefits of avoiding name-hash collisions can be overwhelming in
>> many cases.
>
> Yes, "make sure we notice a file F moving from directory A to B" is
> inherently optimized for short span of history, i.e. a smallish push
> rather than a whole history clone, where the definition of
> "smallish" is that even if you create optimal delta chains, the
> length of these delta chains will not exceed the "--depth" option.
>
> If the history you are pushing modified A/F twice, renamed it to B/F
> (with or without modification at the same time), then modified B/F
> twice more, you'd want to pack the 5-commit segment and having to
> artificially cut the delta chain that can contain all of these 5
> blobs into two at the renaming commit is a huge loss.

Which actually leads me to suspect that we probably do not even have
to expose the --full-name-hash option to the end users in "git repack".

If we are doing an incremental repack that would fit within the depth
setting, it is likely that we would be better off without the
full-name-hash optimization, and if we are doing "repack -a" for the
whole repository, especially with "-f", it would make sense to do the
full-name-hash optimization.

If we could tell how large a chunk of history we are packing before
we actually start calling builtin/pack-objects.c:add_object_entry(),
we should be able to choose automatically whether to use
full-name-hash.  But I do not think we know the object count until we
finish calling add_object_entry().  So unless we are willing to
compute and keep both hashes while reading, and pick between the two
after we finish reading the list of objects, or something like that,
it would require major surgery, I am afraid.
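
To illustrate the "compute and keep both, pick afterwards" idea, here
is a rough standalone sketch, not a patch against pack-objects.  The
tail-weighted hash is modeled on the existing name-hash (later
characters count most, whitespace is skipped).  The full-path hash is
only a stand-in (32-bit FNV-1a), not the algorithm from the series,
and struct entry, record_entry(), finalize_hashes() and the threshold
parameter are all hypothetical names made up for this example.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Tail-weighted hash: the last characters of the path count most. */
static uint32_t name_hash_tail(const char *name)
{
	uint32_t c, hash = 0;

	if (!name)
		return 0;
	while ((c = (unsigned char)*name++) != 0) {
		if (isspace(c))
			continue;
		hash = (hash >> 2) + (c << 24);
	}
	return hash;
}

/* Stand-in full-path hash (FNV-1a); every byte of the path matters. */
static uint32_t name_hash_full(const char *name)
{
	uint32_t hash = 2166136261u;

	if (!name)
		return 0;
	for (; *name; name++) {
		hash ^= (unsigned char)*name;
		hash *= 16777619u;
	}
	return hash;
}

/* Hypothetical per-object record that keeps both candidates. */
struct entry {
	uint32_t tail_hash;
	uint32_t full_hash;
	uint32_t hash;	/* the one chosen once the count is known */
};

/* Called while reading the object list; the count is not known yet. */
static void record_entry(struct entry *e, const char *path)
{
	e->tail_hash = name_hash_tail(path);
	e->full_hash = name_hash_full(path);
	e->hash = 0;
}

/*
 * Called after the whole list has been read.  The threshold is an
 * assumed tuning knob: at or above it, use the full-path hash.
 * Returns 1 if the full-path hash was selected, 0 otherwise.
 */
static int finalize_hashes(struct entry *entries, size_t nr,
			   size_t threshold)
{
	int use_full = nr >= threshold;
	size_t i;

	for (i = 0; i < nr; i++)
		entries[i].hash = use_full ? entries[i].full_hash
					   : entries[i].tail_hash;
	return use_full;
}
```

The cost of this approach is one extra 32-bit field per object while
the list is being read, plus one more pass over the entries before
sorting.  Whether the object count alone is a good enough signal for
the choice is a separate question this sketch does not answer.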




