Re: [PATCH v5 01/11] commit-graph: fix regression when computing Bloom filters

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 12/28/2020 6:15 AM, Abhishek Kumar via GitGitGadget wrote:
> From: Abhishek Kumar <abhishekkumar8222@xxxxxxxxx>
> 
> Before computing Bloom fitlers, the commit-graph machinery uses

s/fitlers/filters/

> commit_gen_cmp to sort commits by generation order for improved diff
> performance. 3d11275505 (commit-graph: examine commits by generation
> number, 2020-03-30) claims that this sort can reduce the time spent to
> compute Bloom filters by nearly half.
> 
> But since c49c82aa4c (commit: move members graph_pos, generation to a
> slab, 2020-06-17), this optimization is broken, since asking for a
> 'commit_graph_generation()' directly returns GENERATION_NUMBER_INFINITY
> while writing.
> 
> Not all hope is lost, though: 'commit_graph_generation()' falls back to
> comparing commits by their date when they have equal generation number,
> and so since c49c82aa4c is purely a date comparision function. This

s/comparision/comparison/

> heuristic is good enough that we don't seem to loose appreciable
> performance while computing Bloom filters. Applying this patch (compared
> with v2.29.1) speeds up computing Bloom filters by around ~4
> seconds.

Using "~4 seconds" here is odd since there is no baseline. Which
repository did you use?

Previous discussion used relative terms. Something like "speeds up by
a factor of 1.25" or something might be interesting.

> So, avoid the useless 'commit_graph_generation()' while writing by
> instead accessing the slab directly. This returns the newly-computed
> generation numbers, and allows us to avoid the heuristic by directly
> comparing generation numbers.

This introduces some timing restrictions to the ability for this
comparison function. It would be dangerous if someone extracted
the method for another purpose. A comment above these lines could
warn future developers from making that mistake, but they would
probably use the comparison functions in commit.c instead.

> Signed-off-by: Abhishek Kumar <abhishekkumar8222@xxxxxxxxx>
> ---
>  commit-graph.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/commit-graph.c b/commit-graph.c
> index 06f8dc1d896..caf823295f4 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -144,8 +144,8 @@ static int commit_gen_cmp(const void *va, const void *vb)
>  	const struct commit *a = *(const struct commit **)va;
>  	const struct commit *b = *(const struct commit **)vb;
>  
> -	uint32_t generation_a = commit_graph_generation(a);
> -	uint32_t generation_b = commit_graph_generation(b);
> +	uint32_t generation_a = commit_graph_data_at(a)->generation;
> +	uint32_t generation_b = commit_graph_data_at(b)->generation;
>  	/* lower generation commits first */
>  	if (generation_a < generation_b)
>  		return -1;
> 




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux