Re: [PATCH] shallow: reset commit grafts when shallow is reset

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I'm not an expert on shallows or commit grafts, so I hesitate to give a
"Reviewed-by", but the description of the bug and fix make sense.

Jonathan Tan <jonathantanmy@xxxxxxxxxx> writes:

> When reset_repository_shallow() is called, Git clears its cache of
> shallow information, so that if shallow information is re-requested, Git
> will read fresh data from disk instead of reusing its stale cached data.
> However, the cache of commit grafts is not likewise cleared, even though
> there are commit grafts created from shallow information.
>
> This means that if on-disk shallow information were to be updated and
> then a commit-graft-using codepath were run (for example, a revision
> walk), Git would be using stale commit graft information. This can be
> seen from the test in this patch, in which Git performs a revision walk
> (to check for changed submodules) after a fetch with --update-shallow.

Hm, as a non-expert, I wasn't able to immediately draw the link between
"reset_repository_shallow() doesn't do enough" to "we reproduce this bug
using fetch --update-shallow with submodules".

What would make this sequence of events clearer to me is describing
why reset_repository_shallow() is called. From reading receive-pack.c
and shallow.c, I believe we call reset_repository_shallow() whenever a
shallow file is written (or more specifically, whenever we commit or
rollback a shallow lock file), invalidating the cache of shallow
information.

One thing I noticed when writing my test case is that fetch without
--update-shallow would never run into this bug. But if one reads 
Documentation/fetch-options.txt:

  --update-shallow::
    By default when fetching from a shallow repository,
    `git fetch` refuses refs that require updating
    .git/shallow. This option updates .git/shallow and accept such
    refs.

it becomes pretty easy to put two and two together - fetch with
--update-shallow writes shallow files, clearing the cache of
shallows (but not commit grafts), so it thinks it has all of the correct
commit graft info. It then tries to perform a rev walk and fails.

It is also worth noting that we only do this rev walk if Git thinks that
there are submodules, as indicated by the presence of .gitmodules (or
after gc/fetch-unused-submodules, whether .git/modules is nonempty).
This would also explain why we never noticed the bug before - it only
occurs when we have both --update-shallow and submodules.

> Therefore, clear the cache of commit grafts whenever
> reset_repository_shallow() is called.
>
> Signed-off-by: Jonathan Tan <jonathantanmy@xxxxxxxxxx>
> ---
> This is the same bug noticed by Glen Choo, as can be seen by the patch
> here [1]. (That patch doesn't seem to be carried forward to subsequent
> versions of the patch set, presumably because it is not needed in that
> patch set.)
>
> I have verified that Glen's test also works with the fix in this patch.
> But I have written a test in t5537-fetch-shallow instead, as I think
> that this is more about updating shallows than fetching submodules.

Makes sense, this is more about shallows.

> [1] https://lore.kernel.org/git/20220215172318.73533-9-chooglen@xxxxxxxxxx/
> ---
>  commit.c                 | 10 ++++++++++
>  commit.h                 |  1 +
>  shallow.c                |  1 +
>  submodule.c              |  1 +
>  t/t5537-fetch-shallow.sh |  9 +++++++++
>  5 files changed, 22 insertions(+)
>
> diff --git a/commit.c b/commit.c
> index d400f5dfa2..f8405fe46e 100644
> --- a/commit.c
> +++ b/commit.c
> @@ -249,6 +249,16 @@ int for_each_commit_graft(each_commit_graft_fn fn, void *cb_data)
>  	return ret;
>  }
>  
> +void reset_commit_grafts(struct repository *r)
> +{
> +	int i;
> +
> +	for (i = 0; i < r->parsed_objects->grafts_nr; i++)
> +		free(r->parsed_objects->grafts[i]);
> +	r->parsed_objects->grafts_nr = 0;
> +	r->parsed_objects->commit_graft_prepared = 0;
> +}
> +
>  struct commit_buffer {
>  	void *buffer;
>  	unsigned long size;
> diff --git a/commit.h b/commit.h
> index 38cc542661..336e0bfa42 100644
> --- a/commit.h
> +++ b/commit.h
> @@ -249,6 +249,7 @@ int commit_graft_pos(struct repository *r, const struct object_id *oid);
>  int register_commit_graft(struct repository *r, struct commit_graft *, int);
>  void prepare_commit_graft(struct repository *r);
>  struct commit_graft *lookup_commit_graft(struct repository *r, const struct object_id *oid);
> +void reset_commit_grafts(struct repository *r);
>  
>  struct commit *get_fork_point(const char *refname, struct commit *commit);
>  
> diff --git a/shallow.c b/shallow.c
> index 71e5876f37..e158be58b0 100644
> --- a/shallow.c
> +++ b/shallow.c
> @@ -90,6 +90,7 @@ static void reset_repository_shallow(struct repository *r)
>  {
>  	r->parsed_objects->is_shallow = -1;
>  	stat_validity_clear(r->parsed_objects->shallow_stat);
> +	reset_commit_grafts(r);
>  }
>  

Makes sense. [un]register_shallow(), register and unregister commit
grafts correctly, so it makes sense that reset_repository_shallow()
should reset the commit grafts too. I don't see any existing functions
that reset commit grafts.

>  int commit_shallow_file(struct repository *r, struct shallow_lock *lk)
> diff --git a/submodule.c b/submodule.c
> index 5ace18a7d9..7a0515913c 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -22,6 +22,7 @@
>  #include "parse-options.h"
>  #include "object-store.h"
>  #include "commit-reach.h"
> +#include "shallow.h"
>  
>  static int config_update_recurse_submodules = RECURSE_SUBMODULES_OFF;
>  static int initialized_fetch_ref_tips;
> diff --git a/t/t5537-fetch-shallow.sh b/t/t5537-fetch-shallow.sh
> index 11d5ea54a9..92948de7a0 100755
> --- a/t/t5537-fetch-shallow.sh
> +++ b/t/t5537-fetch-shallow.sh
> @@ -161,6 +161,15 @@ test_expect_success 'fetch --update-shallow' '
>  	)
>  '
>  
> +test_expect_success 'fetch --update-shallow into a repo with submodules' '
> +	git init a-submodule &&
> +	test_commit -C a-submodule foo &&
> +	git init repo-with-sub &&
> +	git -C repo-with-sub submodule add ../a-submodule a-submodule &&
> +	git -C repo-with-sub commit -m "added submodule" &&
> +	git -C repo-with-sub fetch --update-shallow ../shallow/.git refs/heads/*:refs/remotes/shallow/*
> +'
> +
>  test_expect_success 'fetch --update-shallow (with fetch.writeCommitGraph)' '
>  	(
>  	cd shallow &&
> -- 
> 2.35.1.894.gb6a874cedc-goog



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux