On Tue, May 07, 2024 at 06:25:08AM +0200, Patrick Steinhardt wrote:
> On Mon, May 06, 2024 at 04:22:11PM -0700, Junio C Hamano wrote:
> > "Robin H. Johnson" <robbat2@xxxxxxxxxx> writes:
> >
> > > Gentoo has some tooling that boils down to repeated runs of 'git log -- somepath/'
> > > via cgit as well as other shell tooling.
> > > ...
> > > I was wondering if Git could gain a secondary index of commits, based on
> > > path prefixes, that would speed up the 'git log' run.
> >
> > Perhaps the bloom filters are good fit for the use case?
>
> Yes, Bloom filters are the first thing that pop into my mind here as
> they are exactly designed to solve this problem. So if you rewrite your
> commit graphs with `git commit-graph write --changed-paths --reachable`
> you should hopefully see a significant speedup.

Good news & bad news.

"git log -- sys-apps/pv >/dev/null" as my testcase from before:
The fast system (2.45.0) went from 11 seconds to ~1 second!
The slow system (2.44.0) went from 45 seconds to 49 seconds :-(.

I'll try to trace down why one system slowed down.

commit-graph command:
fast: 1m10s
slow: 3m43s

> It makes me wonder whether we can maybe enable generation of Bloom
> filters by default. The biggest downside is of course that writing
> commit graphs becomes slower. But that should happen in the background
> for normal users anyway, and most forges probably hand-roll maintenance
> and thus wouldn't care.

Most repos are also MUCH smaller than this, so it should be safe to
enable.

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation President & Treasurer
E-Mail   : robbat2@xxxxxxxxxx
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136