On Tue, Oct 09, 2018 at 05:06:20PM -0400, Jeff King wrote:
> On Tue, Oct 09, 2018 at 09:34:43PM +0200, SZEDER Gábor wrote:
>
> > Creating the Bloom filter is sloooow.  Running it on git.git takes
> > about 23s on my hardware, while
> >
> >   git log --format='%H%n%P' --name-only --all >/dev/null
> >
> > gathers all the information necessary for that in about 5.3s.
>
> That command won't open the trees for merges at all. But your
> implementation here looks like it does a diff against each parent of a
> merge.

Yeah, it does so, because that is what try_to_simplify_commit() /
rev_compare_tree() will do while traversing the history.

> Adding "-m" would be a more accurate comparison, I think.
>
> Though I find that puzzling, because "-m --name-only" seems to take
> about 20x longer, not 3x. So perhaps I'm missing something.

Ugh, indeed.