Re: What's cooking in git.git (Jan 2024, #01; Tue, 2)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Jan 13, 2024 at 11:51:57PM +0100, SZEDER Gábor wrote:
> On a related note, if current git (I tried current master and v2.43.0)
> encounters a commit graph layer containing v2 Bloom filters (created
> by current seen) while writing a new commit graph, then it segfaults
> dereferencing a NULL 'settings' pointer in
> get_or_compute_bloom_filter().
>
> The test below demonstrates this, but it's quite hacky using two
> different git versions: it has to be run by an old git version not yet
> supporting v2 Bloom filters, and a new git version already supporting
> them should be installed at /tmp/git-new/.
>
> diff --git a/t/t4216-log-bloom.sh b/t/t4216-log-bloom.sh
> index 2ba0324a69..0464dd68d5 100755
> --- a/t/t4216-log-bloom.sh
> +++ b/t/t4216-log-bloom.sh
> @@ -454,4 +454,33 @@ test_expect_success 'Bloom reader notices out-of-order index offsets' '
>  	test_cmp expect.err err
>  '
>
> +CENT=$(printf "\302\242")
> +test_expect_success 'split commit graph vs changed paths Bloom filter v2 vs old git' '
> +	git init split-v2-old &&
> +	(
> +		cd split-v2-old &&
> +		git commit --allow-empty -m "Bloom filters are written but still ignored for root commits :(" &&
> +		for i in 1 2 3
> +		do
> +			echo $i >$CENT &&
> +			git add $CENT &&
> +			git commit -m "$i" || return 1
> +		done &&
> +		git log --oneline -- $CENT >expect &&
> +
> +		# Here we write a commit graph layer containing v2 changed
> +		# path Bloom filters using a git binary built from current
> +		# 'seen' branch.
> +		git rev-parse HEAD^ |
> +		/tmp/git-new/bin/git -c commitgraph.changedPathsVersion=2 \
> +			commit-graph write --stdin-commits --changed-paths --split &&
> +
> +		# This is current master, and segfaults.
> +		git commit-graph write --reachable --changed-paths &&
> +
> +		git log --oneline -- $CENT >actual &&
> +		test_cmp expect actual
> +	)
> +'
> +
>  test_done

Thanks. The segfault is reproducible on my end, but I don't think that
it is possible to fix this for existing versions of Git. The problem (as
you note in your backtrace) is here:

    #0  0x000055555569c842 in get_or_compute_bloom_filter (
        r=0x5555559c9ce0 <the_repo>, c=0x5555559dffd0, compute_if_not_present=1,
        settings=0x0, computed=0x7fffffffe0f4) at bloom.c:253

Which tries to dereference ctx->bloom_settings, which is NULL. Note that
we initialize some sensible defaults for ctx->bloom_settings in
commit-graph.c::write_commit_graph():

    struct bloom_filter_settings bloom_settings = DEFAULT_BLOOM_FILTER_SETTINGS;
    /* ... */
    bloom_settings.bits_per_entry = git_env_ulong("GIT_TEST_BLOOM_SETTINGS_BITS_PER_ENTRY",
                                                  bloom_settings.bits_per_entry);
    bloom_settings.num_hashes = git_env_ulong("GIT_TEST_BLOOM_SETTINGS_NUM_HASHES",
                                              bloom_settings.num_hashes);
    bloom_settings.max_changed_paths = git_env_ulong("GIT_TEST_BLOOM_SETTINGS_MAX_CHANGED_PATHS",
                                                     bloom_settings.max_changed_paths);
    ctx->bloom_settings = &bloom_settings;

...but we'll throw those away in favor of whatever is in the topmost
layer of the existing commit-graph chain later on in that same function:

    if (!(flags & COMMIT_GRAPH_NO_WRITE_BLOOM_FILTERS)) {
      struct commit_graph *g;

      g = ctx->r->objects->commit_graph;

      /* We have changed-paths already. Keep them in the next graph */
      if (g && g->chunk_bloom_data) {
        ctx->changed_paths = 1;
        ctx->bloom_settings = g->bloom_filter_settings;
      }
    }

OK, everything seems fine thus far, until we inspect the value of
g->bloom_filter_settings, which is NULL, becuase of this hunk from
commit-graph.c::graph_read_bloom_data():

    if (hash_version != 1)
      return 0;

which terminates the function before we assign g->bloom_filter_settings
for the existing (written with v2 Bloom filters) graph layer.

I don't think that there is a way to fix this in a backwards compatible
way, but I'm comfortable with that in this instance since we don't
expect users to upgrading to v2 Bloom filters and then writing new graph
layers using a non-v2 compatible version of Git.

We can add a warning in the series that I'm working on indicating this,
but I don't think there's much more we can do besides changing this to
indicate a warning and bailing instead of segfaulting.

Thanks,
Taylor




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux