On Thu, Mar 20, 2014 at 5:11 AM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> David Kastrup <dak@xxxxxxx> writes:
>
>> Junio C Hamano <gitster@xxxxxxxxx> writes:
>>
>>> David Kastrup <dak@xxxxxxx> writes:
>>>
>>>> The default of 16MiB causes serious thrashing for large delta chains
>>>> combined with large files.
>>>>
>>>> Signed-off-by: David Kastrup <dak@xxxxxxx>
>>>> ---
>>>
>>> Is that a good argument?  Wouldn't the default of 128MiB burden
>>> smaller machines with bloated processes?
>>
>> The default file size before Git forgets about delta compression is
>> 512MiB.  Unpacking 500MiB files with 16MiB of delta storage is going
>> to be uglier.
>>
>> ...
>>
>> Documentation/config.txt states:
>>
>> core.deltaBaseCacheLimit::
>>         Maximum number of bytes to reserve for caching base objects
>>         that may be referenced by multiple deltified objects.  By
>>         storing the entire decompressed base objects in a cache Git
>>         is able to avoid unpacking and decompressing frequently used
>>         base objects multiple times.
>> +
>> Default is 16 MiB on all platforms.  This should be reasonable
>> for all users/operating systems, except on the largest projects.
>> You probably do not need to adjust this value.
>>
>> I've seen this seriously screwing up performance in several projects
>> of mine that don't really count as "largest projects".
>>
>> So the description in combination with the current setting is clearly
>> wrong.
>
> That is good material for a proposed log message, and I think you
> are onto something here.
>
> I know that the 512MiB default for bigFileThreshold (aka "forget
> about delta compression") came out of thin air.  It was just "1GB is
> always too huge for anybody, so let's cut it in half and declare that
> value the initial version of a sane threshold", nothing more.
>
> So it might be that the problem is that 512MiB is still too big,
> relative to the 16MiB of delta base cache, and the former may be what
> needs to be tweaked.  If a blob close to but below 512MiB is a
> problem for a 16MiB delta base cache, it would still be too big to
> cause the same problem for a 128MiB delta base cache---it would evict
> all the other objects and then end up not being able to fit in the
> limit itself, busting the limit immediately, no?
>
> I would understand if the change were to update the definition of
> deltaBaseCacheLimit and link it to the value of bigFileThreshold, for
> example.  With the presented discussion, I am still not sure if we
> can say that bumping deltaBaseCacheLimit is the right solution to the
> "description with the current setting is clearly wrong" problem
> (which is a real issue).

I vote for making big_file_threshold smaller.  512MB is already
unfriendly for many smaller machines.  I'm thinking somewhere around
32MB-64MB (and maybe increasing the delta base cache limit to match).
The only downside I see is that large blobs will be packed
undeltified, which could increase pack size if you have lots of them.
But maybe we could improve pack-objects/repack/gc to deltify large
blobs anyway if they're old enough.
--
Duy
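
For anyone who wants to experiment locally rather than wait for the
defaults to change, both knobs discussed above are ordinary config
options.  A minimal sketch, per repository (the numbers below are
purely illustrative, not values anyone in this thread recommended):

    # Show the current settings; prints nothing if unset, meaning the
    # built-in defaults (512m / 16m) apply.
    git config --get core.bigFileThreshold
    git config --get core.deltaBaseCacheLimit

    # Example only: lower the "forget about deltas" cutoff and grow
    # the delta base cache to match.
    git config core.bigFileThreshold 64m
    git config core.deltaBaseCacheLimit 128m

Both options accept the usual k/m/g size suffixes, so it is easy to
try different combinations on a problem repository before touching
any built-in defaults.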