Hi,

On Fri, Dec 20, 2019 at 11:07 PM Garima Singh via GitGitGadget
<gitgitgadget@xxxxxxxxx> wrote:
>
> The commit graph feature brought in a lot of performance improvements across
> multiple commands. However, file based history continues to be a performance
> pain point, especially in large repositories.
>
> Adopting changed path bloom filters has been discussed on the list before,
> and a prototype version was worked on by SZEDER Gábor, Jonathan Tan and Dr.
> Derrick Stolee [1]. This series is based on Dr. Stolee's approach [2] and
> presents an updated and more polished RFC version of the feature.

Thanks for working on this!

> Performance Gains: We tested the performance of git log -- path on the git
> repo, the linux repo and some internal large repos, with a variety of paths
> of varying depths.
>
> On the git and linux repos: We observed a 2x to 5x speed up.
>
> On a large internal repo with files seated 6-10 levels deep in the tree: We
> observed 10x to 20x speed ups, with some paths going up to 28 times faster.

Very nice!

I have a question though. Are the performance gains only available
with `git log -- path` or are they already available for example when
doing a partial clone and/or a sparse checkout?

> Future Work (not included in the scope of this series):
>
> 1. Supporting multiple path based revision walk
> 2. Adopting it in git blame logic.
> 3. Interactions with line log git log -L

Great!

> This series is intended to start the conversation and many of the commit
> messages include specific call outs for suggestions and thoughts.

I think Peff said during the Virtual Contributor Summit that he was
interested in using bitmaps to speed up partial clone on the server
side. Would it make sense to use both bitmaps and bloom filters?

Thanks,
Christian.