On Sun, Dec 22, 2019 at 10:26:20AM +0100, Christian Couder wrote:

> I have a question though. Are the performance gains only available
> with `git log -- path` or are they already available for example when
> doing a partial clone and/or a sparse checkout?

From my quick look at the code, anything that feeds a pathspec to a
revision traversal would be helped. I'm not sure if it would help for
partial/sparse traversals, though. There we actually need to know which
blobs correspond to the paths in question, not just whether any
particular commit touched them.

I also took a brief look at adding support to the custom blame-tree
implementation we use at GitHub, and got about a 6x speedup.

> > This series is intended to start the conversation and many of the commit
> > messages include specific call outs for suggestions and thoughts.
>
> I think Peff said during the Virtual Contributor Summit that he was
> interested in using bitmaps to speed up partial clone on the server
> side. Would it make sense to use both bitmaps and bloom filters?

I think they're orthogonal. For size-based filters on blobs, you'd just
use bitmaps as normal, because you can post-process the result to check
the type and size of each object in the list (and I have patches to do
this, but they need some polishing and we're not yet running them).

For path-based filters like a sparse specification, you can't use
bitmaps at all; you have to do a real traversal. But there you still
generally get all of the commits. I guess if a commit doesn't touch any
path you're interested in, you could avoid walking into its tree at all,
which might help. I haven't given it much thought yet.

-Peff