On Fri, Jul 23, 2021 at 11:55 AM Jonathan Tan <jonathantanmy@xxxxxxxxxx> wrote: > > "git read-tree" checks the existence of the blobs referenced by the > given tree, but does not bulk prefetch them. Add a bulk prefetch. > > The lack of prefetch here was noticed at $DAYJOB during a merge > involving some specific commits, but I couldn't find a minimal merge > that didn't also trigger the prefetch in check_updates() in > unpack-trees.c (and in all these cases, the lack of prefetch in > cache-tree.c didn't matter because all the relevant blobs would have > already been prefetched by then). This is why I used read-tree here to > exercise this code path. Okay, you have me stumped, I can't figure out what kind of a merge would bypass the check_updates() in unpack-trees.c either. I was curious about octopus or merge.autostash, but I just can't trigger it. Using read-tree to trigger the case makes perfect sense, though. > Signed-off-by: Jonathan Tan <jonathantanmy@xxxxxxxxxx> > --- > cache-tree.c | 11 ++++++++-- > t/t1022-read-tree-partial-clone.sh | 33 ++++++++++++++++++++++++++++++ > 2 files changed, 42 insertions(+), 2 deletions(-) > create mode 100755 t/t1022-read-tree-partial-clone.sh > > diff --git a/cache-tree.c b/cache-tree.c > index 45e58666af..9ba2c7c6b2 100644 > --- a/cache-tree.c > +++ b/cache-tree.c > @@ -237,6 +237,11 @@ int cache_tree_fully_valid(struct cache_tree *it) > return 1; > } > > +static int must_check_existence(const struct cache_entry *ce) > +{ > + return !(has_promisor_remote() && ce_skip_worktree(ce)); > +} > + > static int update_one(struct cache_tree *it, > struct cache_entry **cache, > int entries, > @@ -378,8 +383,7 @@ static int update_one(struct cache_tree *it, > } > > ce_missing_ok = mode == S_IFGITLINK || missing_ok || > - (has_promisor_remote() && > - ce_skip_worktree(ce)); > + !must_check_existence(ce); > if (is_null_oid(oid) || > (!ce_missing_ok && !has_object_file(oid))) { > strbuf_release(&buffer); > @@ -466,6 +470,9 @@ int cache_tree_update(struct index_state *istate, int flags) > if (!istate->cache_tree) > istate->cache_tree = cache_tree(); > > + if (!(flags & WRITE_TREE_MISSING_OK) && has_promisor_remote()) > + prefetch_cache_entries(istate, must_check_existence); > + Nice that the fix is so simple. > trace_performance_enter(); > trace2_region_enter("cache_tree", "update", the_repository); > i = update_one(istate->cache_tree, istate->cache, istate->cache_nr, > diff --git a/t/t1022-read-tree-partial-clone.sh b/t/t1022-read-tree-partial-clone.sh > new file mode 100755 > index 0000000000..a763e27c7d > --- /dev/null > +++ b/t/t1022-read-tree-partial-clone.sh > @@ -0,0 +1,33 @@ > +#!/bin/sh > + > +test_description='git read-tree in partial clones' > + > +TEST_NO_CREATE_REPO=1 > + > +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main > +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME > + > +. ./test-lib.sh > + > +test_expect_success 'read-tree in partial clone prefetches in one batch' ' > + test_when_finished "rm -rf server client trace" && > + > + git init server && > + echo foo >server/one && > + echo bar >server/two && > + git -C server add one two && > + git -C server commit -m "initial commit" && > + TREE=$(git -C server rev-parse HEAD^{tree}) && > + > + git -C server config uploadpack.allowfilter 1 && > + git -C server config uploadpack.allowanysha1inwant 1 && > + git clone --bare --filter=blob:none "file://$(pwd)/server" client && > + GIT_TRACE_PACKET="$(pwd)/trace" git -C client read-tree $TREE && > + > + # "done" marks the end of negotiation (once per fetch). Expect that > + # only one fetch occurs. > + grep "fetch> done" trace >donelines && > + test_line_count = 1 donelines > +' > + > +test_done > -- > 2.32.0.432.gabb21c7263-goog Any reason for preferring GIT_TRACE_PACKET over GIT_TRACE2_PERF and looking for the reported fetch_count (or even the number of fetch_count lines)? Just curious. Anyway, looks good to me.