Re: [PATCH v4 04/10] commit-graph: persist existence of changed-paths

On Wed, Jul 01, 2020 at 01:27:24PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
> 
> The changed-path Bloom filters were released in v2.27.0, but have a
> significant drawback. A user can opt-in to writing the changed-path
> filters using the "--changed-paths" option to "git commit-graph write"
> but the next write will drop the filters unless that option is
> specified.
> 
> This becomes even more important when considering the interaction with
> gc.writeCommitGraph (on by default) or fetch.writeCommitGraph (part of
> features.experimental). These config options trigger commit-graph writes
> that the user did not signal, and hence there is no --changed-paths
> option available.
> 
> Allow a user that opts-in to the changed-path filters to persist the
> property of "my commit-graph has changed-path filters" automatically. A
> user can drop filters using the --no-changed-paths option.

The above parts of the commit message and the corresponding changes
are OK, but ...

> In the process, we need to be extremely careful to match the Bloom
> filter settings as specified by the commit-graph. This will allow future
> versions of Git to customize these settings, and the version with this
> change will persist those settings as commit-graphs are rewritten on
> top.

As pointed out in my original bug report [1], modified path Bloom
filters are computed with hardcoded settings in
bloom.c:get_bloom_filter().  Since this patch does not touch bloom.c
at all, it still computes Bloom filters with those hardcoded settings,
and, consequently, despite the commit message's claims, it does not
persist the settings in the existing commit-graph.

[1] https://public-inbox.org/git/20200619140230.GB22200@xxxxxxxxxx/

> Use the trace2 API to signal the settings used during the write, and
> check that output in a test after manually adjusting the correct bytes
> in the commit-graph file.

This test is insufficient: it only checks which settings trace2
reports, not which settings the Bloom filters are actually computed
with.  That is why it passed even though the bug it was supposed to
guard against was still present.

More robust tests should instead look at what actually gets written to
the commit-graph, and how that is interpreted during pathspec-limited
revision walks.
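
For instance, the settings that really end up on disk can be read back
from the BDAT chunk header.  Below is a minimal sketch, assuming the
layout described in the commit-graph format documentation: an 8-byte
file header with the chunk count at byte 6, 12-byte chunk table
entries, and a BDAT header of three big-endian 32-bit integers.
read_u32 and bloom_settings are hypothetical helper names, not
existing test-lib functions:

```shell
#!/bin/sh
# Hypothetical helpers (not part of test-lib.sh); the offsets below are
# assumptions taken from the commit-graph format documentation.

# read_u32 <file> <offset>
# Print the big-endian unsigned 32-bit integer stored at <offset>.
read_u32 () {
	set -- $(od -An -v -tu1 -j "$2" -N 4 "$1")
	test $# -eq 4 || return 1
	echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# bloom_settings <commit-graph-file>
# Print "<hash version> <number of hashes> <bits per entry>" from the
# BDAT chunk header, or fail if the file has no BDAT chunk.
bloom_settings () {
	graph=$1
	# Byte 6 of the file header holds the number of chunks.
	num_chunks=$(( $(od -An -v -tu1 -j 6 -N 1 "$graph") ))
	i=0
	while test "$i" -lt "$num_chunks"
	do
		# Chunk table entries start at byte 8: 4-byte ID, 8-byte offset.
		entry=$((8 + 12 * i))
		id=$(dd if="$graph" bs=1 skip="$entry" count=4 2>/dev/null)
		if test "$id" = BDAT
		then
			# Only the low 4 bytes of the offset are read, so this
			# handles files smaller than 4GB.
			data=$(read_u32 "$graph" $((entry + 8))) || return 1
			echo "$(read_u32 "$graph" "$data")" \
			     "$(read_u32 "$graph" $((data + 4)))" \
			     "$(read_u32 "$graph" $((data + 8)))"
			return 0
		fi
		i=$((i + 1))
	done
	return 1
}
```

A test could run something like "bloom_settings
.git/objects/info/commit-graph" after each write and compare the
result with the settings it was supposed to inherit.  That covers only
the first half, though: the pathspec-limited log output still has to be
compared against a walk with core.commitGraph disabled.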

> Signed-off-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx>

A "Reported-by: me" trailer would have been appropriate here.

> ---
>  Documentation/git-commit-graph.txt |  5 +++-
>  builtin/commit-graph.c             |  5 +++-
>  commit-graph.c                     | 45 ++++++++++++++++++++++++++++--
>  commit-graph.h                     |  1 +
>  t/t4216-log-bloom.sh               | 17 ++++++++++-
>  5 files changed, 67 insertions(+), 6 deletions(-)

Anyway, this is now partially fixed in 9a7a9ed10d (bloom: use provided
'struct bloom_filter_settings', 2020-09-16), though, unfortunately,
its commit message is not quite clear on this.  The fix is only
partial because we still look only at the top-level commit-graph file
for existing Bloom filter settings.  Deeper commit-graph layers can
contain Bloom filters with non-default settings even when the top
level doesn't, as the failing tests below demonstrate:

  ---  >8  ---

#!/bin/sh

test_description='test'

. ./test-lib.sh

test_expect_success 'setup' '
	git commit --allow-empty -m "Bloom filters are written but ignored for root commits :(" &&
	for i in 1 2 3
	do
		echo $i >file &&
		git add file &&
		git commit -m "$i" || return 1
	done &&
	git log --oneline --no-decorate -- file >expect
'

test_expect_success 'split' '
	# Compute Bloom filters with "unusual" settings.
	git rev-parse HEAD^^ | GIT_TEST_BLOOM_SETTINGS_NUM_HASHES=3 git commit-graph write --stdin-commits --changed-paths --split &&
	# A commit-graph layer without Bloom filters "hides" the layers
	# below ...
	git rev-parse HEAD^ | git commit-graph write --stdin-commits --no-changed-paths --split=no-merge &&
	# ... so this does not look at existing Bloom filters and their
	# settings in the bottom commit-graph layer and computes new
	# Bloom filters using the default 7 hashes.
	git rev-parse HEAD | git commit-graph write --stdin-commits --changed-paths --split=no-merge &&

	# Just to make sure that there are as many graph layers as I
	# think there should be.
	test_line_count = 3 .git/objects/info/commit-graphs/commit-graph-chain &&

	# This checks Bloom filters using settings in the top layer,
	# thus misses commits modifying file in the bottom commit-graph
	# layer.
	git log --oneline --no-decorate -- file >actual &&
	test_cmp expect actual
'

test_expect_success 'merged' '
	# This merges all existing layers, and computes missing Bloom
	# filters with the settings in the top layer, without noticing
	# that filters in the bottom layer were computed with different
	# settings.
	git commit-graph write --reachable --changed-paths &&

	# Just to make sure...
	test_path_is_file .git/objects/info/commit-graph &&

	# This misses commits modifying file that were merged from the
	# bottom commit-graph layer.
	git log --oneline --no-decorate -- file >actual &&
	test_cmp expect actual
'

test_done



