On Sat, Jan 23, 2021 at 11:58 AM Derrick Stolee via GitGitGadget <gitgitgadget@xxxxxxxxx> wrote: > > This is based on ds/cache-tree-basics. > > Here are a few more cleanups that are vaguely related to the index. I > discovered these while preparing my sparse-index RFC that I intend to send > early next week. > > The biggest patch is the final one, which creates a test script for > comparing sparse-checkouts to full checkouts. There are some commands that > do not behave similarly. This script will be the backbone of my testing > strategy for the sparse-index by adding a new mode to compare > sparse-checkouts with the two index types (full and sparse). > > > UPDATES IN V3 > ============= > > * Callers to cache_tree_update() no longer initialize the cache_tree in > advance. > > * Added a patch to update verify_cache() prototype. > > * Added missing "pos + 1" in fsmonitor.c. > > * Added a BUG() statement when repo->istate->repo is already populated, but > not equal to repo. > > * Cleaned up test_region pattern quoting. Thanks, Junio! > > Thanks, -Stolee > > Derrick Stolee (9): > cache-tree: clean up cache_tree_update() > cache-tree: simplify verify_cache() prototype > cache-tree: extract subtree_pos() > fsmonitor: de-duplicate BUG()s around dirty bits > repository: add repo reference to index_state > name-hash: use trace2 regions for init > sparse-checkout: load sparse-checkout patterns > test-lib: test_region looks for trace2 regions > t1092: test interesting sparse-checkout scenarios > > builtin/checkout.c | 3 - > builtin/sparse-checkout.c | 5 - > cache-tree.c | 38 +-- > cache-tree.h | 2 + > cache.h | 1 + > dir.c | 17 ++ > dir.h | 2 + > fsmonitor.c | 27 +- > name-hash.c | 3 + > repository.c | 6 + > sequencer.c | 3 - > t/t0500-progress-display.sh | 3 +- > t/t1092-sparse-checkout-compatibility.sh | 301 +++++++++++++++++++++++ > t/test-lib-functions.sh | 42 ++++ > unpack-trees.c | 8 +- > 15 files changed, 408 insertions(+), 53 deletions(-) > create mode 100755 t/t1092-sparse-checkout-compatibility.sh > > > base-commit: a4b6d202caad83c6dc29abe9b17e53a1b3fb54a0 > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-839%2Fderrickstolee%2Fmore-index-cleanups-v3 > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-839/derrickstolee/more-index-cleanups-v3 > Pull-Request: https://github.com/gitgitgadget/git/pull/839 > > Range-diff vs v2: > > 1: f9dccaed0ac ! 1: bdc8ecca3d2 cache-tree: clean up cache_tree_update() > @@ Commit message > BUG() statement or returning with an error because future callers will > want to populate an empty cache-tree using this method. > > - Also drop local variables that are used exactly once and can be found > - directly from the 'istate' parameter. > + Callers can also remove their conditional allocations of cache_tree. > + > + Also drop local variables that can be found directly from the 'istate' > + parameter. > > Signed-off-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx> > > + ## builtin/checkout.c ## > +@@ builtin/checkout.c: static int merge_working_tree(const struct checkout_opts *opts, > + } > + } > + > +- if (!active_cache_tree) > +- active_cache_tree = cache_tree(); > +- > + if (!cache_tree_fully_valid(active_cache_tree)) > + cache_tree_update(&the_index, WRITE_TREE_SILENT | WRITE_TREE_REPAIR); > + > + > ## cache-tree.c ## > @@ cache-tree.c: static int update_one(struct cache_tree *it, > > @@ cache-tree.c: static int update_one(struct cache_tree *it, > trace2_region_leave("cache_tree", "update", the_repository); > trace_performance_leave("cache_tree_update"); > if (i < 0) > +@@ cache-tree.c: static int write_index_as_tree_internal(struct object_id *oid, > + cache_tree_valid = 0; > + } > + > +- if (!index_state->cache_tree) > +- index_state->cache_tree = cache_tree(); > +- > + if (!cache_tree_valid && cache_tree_update(index_state, flags) < 0) > + return WRITE_TREE_UNMERGED_INDEX; > + > + > + ## sequencer.c ## > +@@ sequencer.c: static int do_recursive_merge(struct repository *r, > + > + static struct object_id *get_cache_tree_oid(struct index_state *istate) > + { > +- if (!istate->cache_tree) > +- istate->cache_tree = cache_tree(); > +- > + if (!cache_tree_fully_valid(istate->cache_tree)) > + if (cache_tree_update(istate, 0)) { > + error(_("unable to update cache tree")); > + > + ## unpack-trees.c ## > +@@ unpack-trees.c: int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options > + if (!ret) { > + if (git_env_bool("GIT_TEST_CHECK_CACHE_TREE", 0)) > + cache_tree_verify(the_repository, &o->result); > +- if (!o->result.cache_tree) > +- o->result.cache_tree = cache_tree(); > + if (!cache_tree_fully_valid(o->result.cache_tree)) > + cache_tree_update(&o->result, > + WRITE_TREE_SILENT | > -: ----------- > 2: 1b8b5680094 cache-tree: simplify verify_cache() prototype > 2: 84323e04d08 = 3: 314b6b34f75 cache-tree: extract subtree_pos() > 3: 31095f9aa0e ! 4: 4e688d25f8c fsmonitor: de-duplicate BUG()s around dirty bits > @@ Commit message > cannot simplify it too much. However, the error string is identical in > each, so this simplifies things. > > + Be sure to add one when checking if a position if valid, since the > + minimum is a bound on the expected size. > + > The end result is that the code is simpler to read while also preserving > these assertions for developers in the FSMonitor space. > > @@ fsmonitor.c > - if (pos >= istate->cache_nr) > - BUG("fsmonitor_dirty has more entries than the index (%"PRIuMAX" >= %u)", > - (uintmax_t)pos, istate->cache_nr); > -+ assert_index_minimum(istate, pos); > ++ assert_index_minimum(istate, pos + 1); > > ce = istate->cache[pos]; > ce->ce_flags &= ~CE_FSMONITOR_VALID; > 4: a0d89d7a973 ! 5: 6373997e05c repository: add repo reference to index_state > @@ Commit message > repository, add a 'repo' pointer to struct index_state that allows > access to this repository. > > + Add a BUG() statement if the repo already has an index, and the index > + already has a repo, but somehow the index points to a different repo. > + > This will prevent future changes from needing to pass an additional > 'struct repository *repo' parameter and instead rely only on the 'struct > index_state *istate' parameter. > @@ repository.c: int repo_read_index(struct repository *repo) > + /* Complete the double-reference */ > + if (!repo->index->repo) > + repo->index->repo = repo; > ++ else if (repo->index->repo != repo) > ++ BUG("repo's index should point back at itself"); > + > return read_index_from(repo->index, repo->index_file, repo->gitdir); > } > 5: bc092f5c703 = 6: 9b545d7dbec name-hash: use trace2 regions for init > 6: 04d1daf7222 = 7: 554cc7647e6 sparse-checkout: load sparse-checkout patterns > 7: 8832ce84623 ! 8: b37181bdec4 test-lib: test_region looks for trace2 regions > @@ t/test-lib-functions.sh: test_subcommand () { > + shift > + fi > + > -+ grep -e "\"region_enter\".*\"category\":\"$1\",\"label\":\"$2\"" "$3" > ++ grep -e '"region_enter".*"category":"'"$1"'","label":"'"$2"\" "$3" > + exitcode=$? > + > -+ if test $exitcode != $expect_exit > ++ if test $exitcode != $expect_exit = 1] I don't understand this change. Is it even valid code? What does it mean? > + then > + return 1 > + fi > + > -+ grep -e "\"region_leave\".*\"category\":\"$1\",\"label\":\"$2\"" "$3" > ++ grep -e '"region_leave".*"category":"'"$1"'","label":"'"$2"\" "$3" > + exitcode=$? > + > -+ if test $exitcode != $expect_exit > ++ if test $exitcode != $expect_exit = 1] Same comment. > + then > + return 1 > + fi > ++ > ++ return 0 > +} > 8: 984458007ed ! 9: 72f925353d3 t1092: test interesting sparse-checkout scenarios > @@ t/t1092-sparse-checkout-compatibility.sh (new) > + echo a >a && > + echo "after deep" >e && > + echo "after folder1" >g && > ++ echo "after x" >z && > + mkdir folder1 folder2 deep x && > + mkdir deep/deeper1 deep/deeper2 && > + mkdir deep/deeper1/deepest && > @@ t/t1092-sparse-checkout-compatibility.sh (new) > + echo "after deepest" >deep/deeper1/e && > + cp a folder1 && > + cp a folder2 && > ++ cp a x && > + cp a deep && > + cp a deep/deeper1 && > + cp a deep/deeper2 && > + cp a deep/deeper1/deepest && > ++ cp -r deep/deeper1/deepest deep/deeper2 && > + git add . && > + git commit -m "initial commit" && > + git checkout -b base && > Having read the previous rounds, the rest of the range-diff looks good to me; I sent out separate comments on the new patch.