On Mon, May 06, 2024 at 04:22:11PM -0700, Junio C Hamano wrote:
> "Robin H. Johnson" <robbat2@xxxxxxxxxx> writes:
>
> > Gentoo has some tooling that boils down to repeated runs of 'git log -- somepath/'
> > via cgit as well as other shell tooling.
> > ...
> > I was wondering if Git could gain a secondary index of commits, based on
> > path prefixes, that would speed up the 'git log' run.
>
> Perhaps the bloom filters are good fit for the use case?

Yes, Bloom filters are the first thing that pops into my mind here, as
they are designed to solve exactly this problem. So if you rewrite your
commit graphs with `git commit-graph write --changed-paths --reachable`
you should hopefully see a significant speedup.

It does surface some usability issues though:

  - There is no easy way to enable the computation of Bloom filters via
    configuration, to the best of my knowledge.

  - How would a non-Git-expert know?

It makes me wonder whether we can maybe enable generation of Bloom
filters by default. The biggest downside is of course that writing
commit graphs becomes slower. But that should happen in the background
for normal users anyway, and most forges probably hand-roll maintenance
and thus wouldn't care.

Is there any other reason I'm missing why those are not written by
default?

Patrick
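
For anyone who wants to try the suggestion, here is a minimal sketch that runs against a throwaway repository so nothing real is touched. The scratch directory, the commit identity, and the file names are made up for illustration; only the `git commit-graph write` invocation is taken from the mail. On a repository this small no speedup will be measurable, so this only shows the sequence of commands.

```shell
#!/bin/sh
# Sketch: build a scratch repository, write a commit-graph that includes
# changed-path Bloom filters, then do a path-limited history walk.
set -e

repo=$(mktemp -d)                 # hypothetical scratch location
cd "$repo"
git init -q .
git config user.name "Example"    # placeholder identity for the test commits
git config user.email "example@example.invalid"

mkdir somepath otherpath

# Several commits that do NOT touch somepath/.
for i in 1 2 3 4 5; do
    echo "$i" > otherpath/file
    git add otherpath/file
    git commit -q -m "other $i"
done

# One commit that does touch somepath/.
echo "x" > somepath/file
git add somepath/file
git commit -q -m "touch somepath"

# The command from the mail: rewrite the commit-graph with Bloom filters.
git commit-graph write --changed-paths --reachable

# The kind of path-limited walk the filters are meant to help with.
count=$(git rev-list --count HEAD -- somepath/)
echo "commits touching somepath/: $count"
```

To compare on a real repository, one could time `git log -- somepath/` before and after the `git commit-graph write` step.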