Re: 1.3.0 creating bigger packs than 1.2.3

Junio C Hamano wrote:

> v1.2.3 hash was base-name only.  doc/Makefile and t/Makefile
> were thrown in the same bin and sorted by size.  When the
> history you are packing is deep, and doc/Makefile and t/Makefile
> are not related at all, this made the effective size of the delta
> window 1/N, where N is the number of such duplicates.
> 
> The one you found above uses a hash of the full path.
> The two are in completely different bins, and the bins are
> effectively random.  This was not a good strategy.
> 
> v1.3.0 hash is base-name hash concatenated with leading-path
> hash.  t/Makefile and doc/Makefile go in separate bins, but the
> bins are close to each other; this avoids the problem in v1.2.3
> when you have deep history, but at the same time if you do not
> have many many versions of t/Makefile to overflow the delta
> window, it gives t/Makefile a chance to delta with doc/Makefile.
[...]
> You could try this patch to resurrect the hash used in v1.2.3,
> and you may get better packing for your particular repository;
> but I am not sure if it gives better results in the general
> case.  I am running the test myself now while waiting for my
> day-job database to load X-<.

Perhaps the packing code could check which version of the hash gives the
smaller pack, or at least accept an instruction to use a different packing
heuristic for a specific repository? Surely a 2x difference in size is worth
considering (and worth the added complication)...
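
To make the three binning strategies quoted above concrete, here is a rough
sketch. It is not the actual pack-objects code: the function names are
hypothetical, zlib.crc32 is only a stand-in for whatever hash git really
uses, and the 32-bit shift is an assumed way of "concatenating" two hashes.
It only shows how the choice of sort key decides which paths end up next to
each other in the delta window.

```python
import zlib


def basename_hash(path):
    # v1.2.3-style, as described above: only the base name is hashed,
    # so doc/Makefile and t/Makefile fall into the same bin.
    base = path.rpartition("/")[2]
    return zlib.crc32(base.encode())


def fullpath_hash(path):
    # Full-path style: the whole path is hashed, so related names
    # land in unrelated bins.
    return zlib.crc32(path.encode())


def combined_hash(path):
    # v1.3.0-style, as described above: base-name hash in the high bits,
    # leading-path hash in the low bits (assumed layout). Same base name
    # means neighbouring bins; different directory means separate bins.
    head, _, base = path.rpartition("/")
    return (zlib.crc32(base.encode()) << 32) | zlib.crc32(head.encode())


def sort_order(paths, name_hash):
    # Candidates are ordered by hash; the path is only a tie-breaker
    # here (the real code sorts by size within a bin).
    return sorted(paths, key=lambda p: (name_hash(p), p))
```

With combined_hash, sort_order() keeps t/Makefile and doc/Makefile next to
each other without merging them into one bin, which is the trade-off
described in the quoted text.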

-- 
Jakub Narebski
Warsaw, Poland
