Re: [PATCH 07/15] commit-graph: new filter ver. that fixes murmur3

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Aug 29, 2023 at 09:31:23AM -0700, Jonathan Tan wrote:
> SZEDER Gábor <szeder.dev@xxxxxxxxx> writes:
 
> > > +test_expect_success 'version 2 changed-path used when version 2 requested' '
> > > +	(
> > > +		cd highbit2 &&
> > > +		test_bloom_filters_used "-- $CENT"
> > 
> > This test_bloom_filter_used helper runs two pathspec-limited 'git log'
> > invocations, one with disabled and the other with enabled
> > commit-graph, and thus with disabled and enabled modified path Bloom
> > filters, and compares their output.
> > 
> > One of the flaws of the current modified path Bloom filters
> > implementation is that it doesn't check Bloom filters for root
> > commits.
> > 
> > In several of the above test cases test_bloom_filters_used is invoked
> > in a repository with only a root commit, so they don't check that
> > the output is the same with and without Bloom filters.
> 
> Ah...you are right. Indeed, when I flip one of the tests from
> test_bloom_filters_used to _not_, the test still passes. I'll change
> the tests.

I'd prefer to leave the test cases unchanged, and make the revision
walking machinery look at Bloom filters even for root commits, because
this is an annoying and recurring testing issue.  I remember it
annoyed me back then, when I wanted to reproduce a couple of bugs that
I knew were there, but my initial test cases didn't fail because the
Bloom filter of the root commit was ignored; Derrick overlooked this
in b16a8277644, you overlooked it now, and none of the reviewers then
and now caught it, either.

> > The string "split" occurs twice in this patch series, but only in
> > patch hunk contexts, and it doesn't occur at all in the previous long
> > thread about the original patch series.
> > 
> > Unfortunately, split commit-graphs weren't really considered in the
> > design of the modified path Bloom filters feature, and layers with
> > different Bloom filter settings weren't considered at all.  I've
> > reported it back then, but the fixes so far were incomplete, and e.g.
> > the test cases shown in
> > 
> >   https://public-inbox.org/git/20201015132147.GB24954@xxxxxxxxxx/
> > 
> > still fail.
> > 
> > Since the interaction of different versions and split commit-graphs
> > was neither mentioned in any of the commit messages nor discussed
> > during the previous rounds, and there isn't any test case excercising
> > it, and since the Bloom filter version information is stored in the
> > same 'g->bloom_filter_settings' structure as the number of hashes, I'm
> > afraid (though haven't actually checked) that handling commit-graph
> > layers with different Bloom filter versions is prone to the same
> > issues as well.
> 
> My original design (up to patch 7 in this patch set) defends against
> this by taking the very first version detected and rejecting every
> other version, and Taylor's subsequent design reads every version, but
> annotates filters with its version. So I think we're covered.

In the meantime I adapted the test cases from the above linked message
to write commit-graph layers with different Bloom filter versions, and
it does fail, because commits from the bottom commit-graph layer are
omitted from the revision walk's output.  And the test case doesn't
even need a middle layer without modified path Bloom filters to "hide"
the different version in the bottom layer.  Merging the layers seems
to work, though.

Besides fixing this issue, I think that the interaction of different
Bloom filter versions and split commit-graphs needs to be thoroughly
covered with test cases and discussed in the commit messages before
this series could be considered good for merging.


diff --git a/t/t4216-log-bloom.sh b/t/t4216-log-bloom.sh
index 48f8109a66..55f67e5110 100755
--- a/t/t4216-log-bloom.sh
+++ b/t/t4216-log-bloom.sh
@@ -586,4 +586,40 @@ test_expect_success 'when writing commit graph, reuse changed-path of another ve
 	test_filter_upgraded 1 trace2.txt
 '
 
+test_expect_success 'split commit graph vs changed paths breakage - setup' '
+	git init split-breakage &&
+	(
+		cd split-breakage &&
+		git commit --allow-empty -m "Bloom filters are written but still ignored for root commits :(" &&
+		for i in 1 2 3
+		do
+			echo $i >$CENT &&
+			git add $CENT &&
+			git commit -m "$i" || return 1
+		done &&
+		git log --oneline -- $CENT >expect
+	)
+'
+
+test_expect_failure 'split commit graph vs changed paths breakage - split' '
+	(
+		cd split-breakage &&
+
+		# Compute v1 Bloom filters for the commits at the bottom.
+		git rev-parse HEAD^ | git commit-graph write --stdin-commits --changed-paths --split &&
+		# Compute v2 Bloom filters for the rest of the commits at the top.
+		git rev-parse HEAD | git -c commitgraph.changedPathsVersion=2 commit-graph write --stdin-commits --changed-paths --split=no-merge &&
+
+		# Just to make sure that there are as many graph layers as I
+		# think there should be.
+		test_line_count = 2 .git/objects/info/commit-graphs/commit-graph-chain &&
+
+		# This checks Bloom filters using version information in the
+		# top layer, thus misses commits modifying the file in the
+		# bottom commit-graph layer.
+		git log --oneline -- $CENT >actual &&
+		test_cmp expect actual
+	)
+'
+
 test_done

It fails with:

  + cd split-breakage
  + git rev-parse HEAD^
  + git commit-graph write --stdin-commits --changed-paths --split
  + git rev-parse HEAD
  + git -c commitgraph.changedPathsVersion=2 commit-graph write --stdin-commits --changed-paths --split=no-merge
  + test_line_count = 2 .git/objects/info/commit-graphs/commit-graph-chain
  + git log --oneline -- ¢
  + test_cmp expect actual
  --- expect      2023-08-30 19:07:59.882066827 +0000
  +++ actual      2023-08-30 19:07:59.894067148 +0000
  @@ -1,3 +1 @@
   1db2248 3
  -cfcc97f 2
  -bd9c2c8 1
  error: last command exited with $?=1




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux