Re: [PATCH 07/15] commit-graph: new filter ver. that fixes murmur3

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



SZEDER Gábor <szeder.dev@xxxxxxxxx> writes:
> > @@ -66,7 +66,64 @@ int load_bloom_filter_from_graph(struct commit_graph *g,
> >   * Not considered to be cryptographically secure.
> >   * Implemented as described in https://en.wikipedia.org/wiki/MurmurHash#Algorithm
> >   */
> > -uint32_t murmur3_seeded(uint32_t seed, const char *data, size_t len)
> > +uint32_t murmur3_seeded_v2(uint32_t seed, const char *data, size_t len)
> 
> Nit: MurmurHash3 implementations in C/C++ (apart from the sample
> implementation on Wikipedia), and other hash functions taking data
> pointer and buffer size parameters in general, have a 'const void
> *data' parameter, not 'const char*'.

I think either works, so I'll stick with what the existing code uses.
(Well, we probably should have used "unsigned char" in the first place.)

> > +test_expect_success 'when writing another commit graph, preserve existing version 1 of changed-path' '
> > +	test_commit -C highbit1 c1double "$CENT$CENT" &&
> > +	git -C highbit1 commit-graph write --reachable --changed-paths &&
> 
> Nit: Since there is a subshell cd-ing into the 'highbit1' directory
> anyway, it would look clearer to put these two commands into that
> subshell as well.
> This applies to some of the later test cases as well.

Makes sense, but this patch series has already been reviewed a few times
so I don't think it's worth making this change at this point in the
review process for a small increase in readability.

> > +test_expect_success 'version 2 changed-path used when version 2 requested' '
> > +	(
> > +		cd highbit2 &&
> > +		test_bloom_filters_used "-- $CENT"
> 
> This test_bloom_filter_used helper runs two pathspec-limited 'git log'
> invocations, one with disabled and the other with enabled
> commit-graph, and thus with disabled and enabled modified path Bloom
> filters, and compares their output.
> 
> One of the flaws of the current modified path Bloom filters
> implementation is that it doesn't check Bloom filters for root
> commits.
> 
> In several of the above test cases test_bloom_filters_used is invoked
> in a repository with only a root commit, so they don't check that
> the output is the same with and without Bloom filters.

Ah...you are right. Indeed, when I flip one of the tests from
test_bloom_filters_used to _not_, the test still passes. I'll change
the tests.

> The string "split" occurs twice in this patch series, but only in
> patch hunk contexts, and it doesn't occur at all in the previous long
> thread about the original patch series.
> 
> Unfortunately, split commit-graphs weren't really considered in the
> design of the modified path Bloom filters feature, and layers with
> different Bloom filter settings weren't considered at all.  I've
> reported it back then, but the fixes so far were incomplete, and e.g.
> the test cases shown in
> 
>   https://public-inbox.org/git/20201015132147.GB24954@xxxxxxxxxx/
> 
> still fail.
> 
> Since the interaction of different versions and split commit-graphs
> was neither mentioned in any of the commit messages nor discussed
> during the previous rounds, and there isn't any test case excercising
> it, and since the Bloom filter version information is stored in the
> same 'g->bloom_filter_settings' structure as the number of hashes, I'm
> afraid (though haven't actually checked) that handling commit-graph
> layers with different Bloom filter versions is prone to the same
> issues as well.

My original design (up to patch 7 in this patch set) defends against
this by taking the very first version detected and rejecting every
other version, and Taylor's subsequent design reads every version, but
annotates filters with its version. So I think we're covered.




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux