Duy Nguyen <pclouds@xxxxxxxxx> writes:

> On Mon, May 5, 2014 at 12:13 AM, David Kastrup <dak@xxxxxxx> wrote:
>> The default of 16m causes serious thrashing for large delta chains
>> combined with large files.
>>
>> Here are some benchmarks (pu variant of git blame):
>>
>> time git blame -C src/xdisp.c >/dev/null
>
> ...
>
>> diff --git a/Documentation/config.txt b/Documentation/config.txt
>> index 1932e9b..21a3c86 100644
>> --- a/Documentation/config.txt
>> +++ b/Documentation/config.txt
>> @@ -489,7 +489,7 @@ core.deltaBaseCacheLimit::
>>  to avoid unpacking and decompressing frequently used base
>>  objects multiple times.
>>  +
>> -Default is 16 MiB on all platforms. This should be reasonable
>> +Default is 96 MiB on all platforms. This should be reasonable
>>  for all users/operating systems, except on the largest projects.
>>  You probably do not need to adjust this value.
>
> So emacs.git falls exactly into the "except on the largest projects"
> part.

git gc --aggressive has been used/recommended for _all_ projects
regularly, leading to delta chains with a length of 250. So this delta
chain size is not exceptional but will eventually occur in any archive
that has been created and maintained according to the recommendations of
Git's documentation (which recommends gc --aggressive every few hundred
revisions).

I was illustrating the effect on a file of size 1MB. That's not an
egregiously large file either. 96MB is the point of diminishing
returns for this case, which is _6_ times larger than the current default
and _small_ in comparison with the memory installed on developer
machines nowadays. Similar slowdowns occur with other examples.

Git will, with the current defaults, accept files of 512MB size into its
compression scheme (and thus its core memory) before punting. The
current deltaBaseCacheLimit of 16MB is rather ridiculous, in particular
with the pre-2.0 settings for gc --aggressive, and causes serious
performance degradation.
It was actually ridiculous even 10 years ago.

> Would it make more sense to advise git devs to set this per repo
> instead? The majority of (open source) repositories out there are
> small if I'm not mistaken. Of those few big repos, we could have a
> section listing all the tips and tricks to tune git. This is one of
> them. Index v4 and sparse checkout are some other. In future, maybe
> watchman support, split index and untracked cache as well.

Shrug. The last version of the patch was refused because more evidence
was wanted. I added the evidence. And I have it on record in the
mailing list and can point to it when people ask me why Git is so slow
for "git blame" in comparison to other version control systems in spite
of my purporting to have improved it.

I'm definitely not going to jump through any more hoops here. I don't
see a point in this kind of spectacle.

-- 
David Kastrup
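
For reference, the per-repository override suggested in the quoted reply could be applied roughly as follows. This is only a sketch under stated assumptions: the scratch repository created with mktemp is hypothetical and exists solely so the commands are harmless to run, and 96m is simply the value proposed in the patch above, not a measured optimum for any other repository.

```shell
# Hypothetical scratch repository, used only so the commands are safe to run.
repo=$(mktemp -d)
git init -q "$repo"

# Set the limit for this repository only (stored in $repo/.git/config).
git -C "$repo" config core.deltaBaseCacheLimit 96m

# Read the value back; --get prints it exactly as stored.
limit=$(git -C "$repo" config --get core.deltaBaseCacheLimit)
echo "core.deltaBaseCacheLimit = $limit"
```

In an existing checkout, running the `git config` line from inside the repository (without `-C "$repo"`) has the same effect; adding `--global` would instead apply the setting to all repositories of the current user.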