Re: [PATCH 1/6] commit-graph: fix regression when computing bloom filter

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Aug 04, 2020 at 02:46:55AM +0200, Jakub Narębski wrote:
> "Abhishek Kumar via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:
>
> > From: Abhishek Kumar <abhishekkumar8222@xxxxxxxxx>
> >
> > With 3d112755 (commit-graph: examine commits by generation number), Git
> > knew to sort by generation number before examining the diff when not
> > using pack order. c49c82aa (commit: move members graph_pos, generation
> > to a slab, 2020-06-17) moved generation number into a slab and
> > introduced a helper which returns GENERATION_NUMBER_INFINITY when
> > writing the graph. Sorting is no longer useful and essentially reverts
> > the earlier commit.
> >
> > Let's fix this by accessing generation number directly through the slab.
>
> It looks like unfortunate and unforeseen consequence of putting together
> graph position and generation number in the commit_graph_data struct.
> During writing of the commit-graph file generation number is computed,
> but graph position is undefined (yet), and commit_graph_generation()
> uses graph_pos field to find if the data for commit is initialized;
> in this case wrongly.
>
> Anyway, when writing the commit graph we first compute generation
> number, then (if requested) the changed-paths Bloom filter.  Skipping
> the unnecessary check is a good thing... assuming that commit_gen_cmp()
> is used only when writing the commit graph, and not when traversing it
> (because then some commits may not have generation number set, and maybe
> even do not have any data on the commit slab) - which is the case.
>
> >
> > Signed-off-by: Abhishek Kumar <abhishekkumar8222@xxxxxxxxx>
> > ---
> >  commit-graph.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/commit-graph.c b/commit-graph.c
> > index 1af68c297d..5d3c9bd23c 100644
> > --- a/commit-graph.c
> > +++ b/commit-graph.c
>
> We might want to add function comment either here or in the header that
> this comparisonn function is to be used only for `git commit-graph
> write`, and not for graph traversal (even if similar funnction exists in
> other modules).

I think that probably within the function is just fine, and that we can
avoid touching commit-graph.h here.

>
> > @@ -144,8 +144,9 @@ static int commit_gen_cmp(const void *va, const void *vb)
> >  	const struct commit *a = *(const struct commit **)va;
> >  	const struct commit *b = *(const struct commit **)vb;

Maybe something like:

  /*
   * Access the generation number directly with
   * 'commit_graph_data_at(...)->generation' instead of going through
   * the slab as usual to avoid accessing a yet-uncomputed value.
   */

Folks that are curious for more can blame this commit and read there.
I'd err on the side of being brief in the code comment and verbose in
the commit message than the other way around ;).

> >
> > -	uint32_t generation_a = commit_graph_generation(a);
> > -	uint32_t generation_b = commit_graph_generation(b);
> > +	uint32_t generation_a = commit_graph_data_at(a)->generation;
> > +	uint32_t generation_b = commit_graph_data_at(b)->generation;
> > +
> >  	/* lower generation commits first */
> >  	if (generation_a < generation_b)
> >  		return -1;
>
> Best,
> --
> Jakub Narębski

Thanks,
Taylor



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux