On 12/28/2020 6:15 AM, Abhishek Kumar via GitGitGadget wrote: > From: Abhishek Kumar <abhishekkumar8222@xxxxxxxxx> > > Before computing Bloom fitlers, the commit-graph machinery uses s/fitlers/filters/ > commit_gen_cmp to sort commits by generation order for improved diff > performance. 3d11275505 (commit-graph: examine commits by generation > number, 2020-03-30) claims that this sort can reduce the time spent to > compute Bloom filters by nearly half. > > But since c49c82aa4c (commit: move members graph_pos, generation to a > slab, 2020-06-17), this optimization is broken, since asking for a > 'commit_graph_generation()' directly returns GENERATION_NUMBER_INFINITY > while writing. > > Not all hope is lost, though: 'commit_graph_generation()' falls back to > comparing commits by their date when they have equal generation number, > and so since c49c82aa4c is purely a date comparision function. This s/comparision/comparison/ > heuristic is good enough that we don't seem to loose appreciable > performance while computing Bloom filters. Applying this patch (compared > with v2.29.1) speeds up computing Bloom filters by around ~4 > seconds. Using "~4 seconds" here is odd since there is no baseline. Which repository did you use? Previous discussion used relative terms. Something like "speeds up by a factor of 1.25" or something might be interesting. > So, avoid the useless 'commit_graph_generation()' while writing by > instead accessing the slab directly. This returns the newly-computed > generation numbers, and allows us to avoid the heuristic by directly > comparing generation numbers. This introduces some timing restrictions to the ability for this comparison function. It would be dangerous if someone extracted the method for another purpose. A comment above these lines could warn future developers from making that mistake, but they would probably use the comparison functions in commit.c instead. > Signed-off-by: Abhishek Kumar <abhishekkumar8222@xxxxxxxxx> > --- > commit-graph.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/commit-graph.c b/commit-graph.c > index 06f8dc1d896..caf823295f4 100644 > --- a/commit-graph.c > +++ b/commit-graph.c > @@ -144,8 +144,8 @@ static int commit_gen_cmp(const void *va, const void *vb) > const struct commit *a = *(const struct commit **)va; > const struct commit *b = *(const struct commit **)vb; > > - uint32_t generation_a = commit_graph_generation(a); > - uint32_t generation_b = commit_graph_generation(b); > + uint32_t generation_a = commit_graph_data_at(a)->generation; > + uint32_t generation_b = commit_graph_data_at(b)->generation; > /* lower generation commits first */ > if (generation_a < generation_b) > return -1; >