On Tue, Mar 1, 2022 at 12:24 PM Victoria Dye via GitGitGadget <gitgitgadget@xxxxxxxxx> wrote: > > Like previous integrations [1] [2], this series allows 'git read-tree' to > operate using a sparse index. > > The first three patches are bugfixes for issues found while implementing the > 'read-tree' integration: > > * The first (patch [1/8]) fixes an edge case in which a repo with no > in-cone files or directories would have its root collapsed into a sparse > directory; the fix ensures the root directory is never collapsed into a > sparse directory. > * The second (patch [2/8]) corrects the 'git status' reporting of changes > nested inside the subdirectory of a sparse directory, ensuring that the > modified file (not the subdirectory) is correctly reported as having > changes. > * The third (patch [3/8]) explicitly disallows the somewhat ambiguous case > of using 'git read-tree --prefix=/' to represent the repo root (it could > be interpreted as a leading slash - even though it's actually trailing > slash - which is not allowed for prefixes). Now, an error specifying that > prefixes cannot begin with '/' guides users more directly to the correct > syntax. If a user does want to specify the repo root as their prefix, > that can still be done with 'git read-tree --prefix=' > > The remainder of the series focuses on utilizing the sparse index in 'git > read-tree'. After some baseline behavior-establishing tests (patch [4/8]), > sparse index usage is trivially enabled (patch [5/8]) for 'read-tree' > except: > > * usage with '--prefix' > * two- and three-way merge > > These cases require additional changes in order to work as expected (i.e., > outwardly matching non-sparse index sparse-checkout). For the former, the > sparse index can be enabled as long as the index is expanded when the prefix > is a directory outside the sparse cone (patch [6/8]). For the latter, sparse > directories that cannot be trivially merged must have their contents merged > file-by-file, done by recursively traversing the trees represented by the > sparse directories (patches [7/8] & [8/8]). > > > Changes since V2 > ================ > > * revised/clarified explanation for why the 'recursive' flag is necessary > in 'git status' internal diff in both the commit message for [2/8] and > the code comments > * added patch [3/8] to disallow '/' as a prefix > * moved the loop over different 'read-tree' scenarios outside the test in > t1092 test 'read-tree --merge with files outside sparse definition', > separating the tests into individually-runnable and traceable cases > * improved test case clarity and variety in 'read-tree --prefix' tests > * removed redundant arg to 'git reset' in 'sparse index is not expanded: > read-tree' > * pointed out pre-existing testing that covers '--prefix' in [6/8] commit > message > * moved prefix-dependent index expansion logic into function > 'update_sparsity_for_prefix', improved explanation for function behavior > in comments > * pointed out pre-existing testing that covers two- and three-way merge in > [7/8] & [8/8] commit messages, respectively > * added test cases for "ensure not expanded" tests of read-tree for both > two- and three-way merge > * fixed issue with three-way merge implementation (if the "no merge" cases > described in 't/t1000-read-tree-m-3way.sh' are reached, sparse > directories should be recursively merged), testing added in the "ensure > not expanded" cases I've read through all of these. This round is: Reviewed-by: Elijah Newren <newren@xxxxxxxxx> > > > Changes since V1 > ================ > > * switched an empty string check from '!strlen(path)' to the > slightly-less-expensive '!*path' > > Thanks! > > * Victoria > > [1] > https://lore.kernel.org/git/pull.1109.v2.git.1641924306.gitgitgadget@xxxxxxxxx/ > [2] > https://lore.kernel.org/git/pull.1048.v6.git.1638201164.gitgitgadget@xxxxxxxxx/ > > Victoria Dye (8): > sparse-index: prevent repo root from becoming sparse > status: fix nested sparse directory diff in sparse index > read-tree: explicitly disallow prefixes with a leading '/' > read-tree: expand sparse checkout test coverage > read-tree: integrate with sparse index > read-tree: narrow scope of index expansion for '--prefix' > read-tree: make two-way merge sparse-aware > read-tree: make three-way merge sparse-aware > > builtin/read-tree.c | 14 ++- > dir.c | 7 +- > t/perf/p2000-sparse-operations.sh | 1 + > t/t1003-read-tree-prefix.sh | 10 ++ > t/t1092-sparse-checkout-compatibility.sh | 133 ++++++++++++++++++++ > unpack-trees.c | 147 +++++++++++++++++++++-- > wt-status.c | 9 ++ > 7 files changed, 308 insertions(+), 13 deletions(-) > > > base-commit: e6ebfd0e8cbbd10878070c8a356b5ad1b3ca464e > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1157%2Fvdye%2Fsparse%2Fread-tree-v3 > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1157/vdye/sparse/read-tree-v3 > Pull-Request: https://github.com/gitgitgadget/git/pull/1157 > > Range-diff vs v2: > > 1: 744668eeece = 1: 744668eeece sparse-index: prevent repo root from becoming sparse > 2: f0cff03b95d ! 2: 26f4d30bd95 status: fix nested sparse directory diff in sparse index > @@ Metadata > ## Commit message ## > status: fix nested sparse directory diff in sparse index > > - Add the 'recursive' flag to 'wt_status_collect_changes_index(...)'. Without > - the 'recursive' flag, 'git status' could report index changes incorrectly > - when the following conditions were met: > + Enable the 'recursive' diff option for the diff executed as part of 'git > + status'. Without the 'recursive' enabled, 'git status' reports index > + changes incorrectly when the following conditions were met: > > * sparse index is enabled > * there is a difference between the index and HEAD in a file inside a > *subdirectory* of a sparse directory > * the sparse directory index entry is *not* expanded in-core > > - In this scenario, 'git status' would not recurse into the sparse directory's > - subdirectories to identify which file contained the difference between the > - index and HEAD. Instead, it would report the immediate subdirectory itself > - as "modified". > + Because it is not recursive by default, the diff in 'git status' reports > + changes only at the level of files and directories that are immediate > + children of a sparse directory, rather than recursing into directories with > + changes to identify the modified file(s). As a result, 'git status' reports > + the immediate subdirectory itself as "modified". > > Example: > > @@ Commit message > (use "git restore --staged <file>..." to unstage) > modified: sparse/sub > > - The 'recursive' diff option in 'wt_status_collect_changes_index' corrects > - this by indicating that 'git status' should recurse into sparse directories > - to find modified files. Given the same repository setup as the example > - above, the corrected result of `git status` is: > + Enabling the 'recursive' diff option in 'wt_status_collect_changes_index' > + corrects this issue by allowing the diff to recurse into subdirectories of > + sparse directories to find modified files. Given the same repository setup > + as the example above, the corrected result of `git status` is: > > $ git status > On branch master > @@ wt-status.c: static void wt_status_collect_changes_index(struct wt_status *s) > rev.diffopt.rename_score = s->rename_score >= 0 ? s->rename_score : rev.diffopt.rename_score; > + > + /* > -+ * The `recursive` option must be enabled to show differences in files > -+ * *more than* one level deep in a sparse directory index entry (e.g., given > -+ * sparse directory 'sparse-dir/', reporting a difference in the file > -+ * 'sparse-dir/another-dir/my-file'). > ++ * The `recursive` option must be enabled to allow the diff to recurse > ++ * into subdirectories of sparse directory index entries. If it is not > ++ * enabled, a subdirectory containing file(s) with changes is reported > ++ * as "modified", rather than the modified files themselves. > + */ > + rev.diffopt.flags.recursive = 1; > + > -: ----------- > 3: e48a281a4d3 read-tree: explicitly disallow prefixes with a leading '/' > 3: ffe0b6aff2b ! 4: 90ebcb7b8ff read-tree: expand sparse checkout test coverage > @@ Commit message > emphasis is placed on interactions with files outside the sparse cone, e.g. > merges with out-of-cone conflicts. > > + Helped-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> > Signed-off-by: Victoria Dye <vdye@xxxxxxxxxx> > > ## t/perf/p2000-sparse-operations.sh ## > @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'update-index --ca > test_cmp expect sparse-checkout-out > ' > > -+test_expect_success 'read-tree --merge with files outside sparse definition' ' > -+ init_repos && > -+ > -+ test_all_match git checkout -b test-branch update-folder1 && > -+ for MERGE_TREES in "base HEAD update-folder2" \ > -+ "update-folder1 update-folder2" \ > -+ "update-folder2" > -+ do > -+ # Clean up and remove on-disk files > -+ test_all_match git reset --hard HEAD && > -+ test_sparse_match git sparse-checkout reapply && > ++for MERGE_TREES in "base HEAD update-folder2" \ > ++ "update-folder1 update-folder2" \ > ++ "update-folder2" > ++do > ++ test_expect_success "'read-tree -mu $MERGE_TREES' with files outside sparse definition" ' > ++ init_repos && > + > + # Although the index matches, without --no-sparse-checkout, outside-of- > + # definition files will not exist on disk for sparse checkouts > @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'update-index --ca > + test_all_match git read-tree -mu --no-sparse-checkout $MERGE_TREES && > + test_all_match git status --porcelain=v2 && > + test_cmp sparse-checkout/folder2/a sparse-index/folder2/a && > -+ test_cmp sparse-checkout/folder2/a full-checkout/folder2/a || return 1 > -+ done > -+' > ++ test_cmp sparse-checkout/folder2/a full-checkout/folder2/a > ++ > ++ ' > ++done > + > +test_expect_success 'read-tree --merge with edit/edit conflicts in sparse directories' ' > + init_repos && > @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'update-index --ca > + test_all_match test_must_fail git read-tree -mu rename-out-to-in update-folder1 > +' > + > -+test_expect_success 'read-tree --prefix outside sparse definition' ' > ++test_expect_success 'read-tree --prefix' ' > + init_repos && > + > -+ # Cannot read-tree --prefix with a single argument when files exist within > -+ # prefix > -+ test_all_match test_must_fail git read-tree --prefix=folder1/ -u update-folder1 && > ++ # If files differing between the index and target <commit-ish> exist > ++ # inside the prefix, `read-tree --prefix` should fail > ++ test_all_match test_must_fail git read-tree --prefix=deep/ deepest && > ++ test_all_match test_must_fail git read-tree --prefix=folder1/ update-folder1 && > + > -+ test_all_match git read-tree --prefix=folder2/0 -u rename-base && > -+ test_path_is_missing sparse-checkout/folder2 && > -+ test_path_is_missing sparse-index/folder2 && > ++ # If no differing index entries exist matching the prefix, > ++ # `read-tree --prefix` updates the index successfully > ++ test_all_match git rm -rf deep/deeper1/deepest/ && > ++ test_all_match git read-tree --prefix=deep/deeper1/deepest -u deepest && > ++ test_all_match git status --porcelain=v2 && > ++ > ++ test_all_match git rm -rf --sparse folder1/ && > ++ test_all_match git read-tree --prefix=folder1/ -u update-folder1 && > ++ test_all_match git status --porcelain=v2 && > + > -+ test_all_match git read-tree --reset -u HEAD && > -+ test_all_match git read-tree --prefix=folder2/0 -u --no-sparse-checkout rename-base && > -+ test_cmp sparse-checkout/folder2/0/a sparse-index/folder2/0/a && > -+ test_cmp sparse-checkout/folder2/0/a full-checkout/folder2/0/a > ++ test_all_match git rm -rf --sparse folder2/0 && > ++ test_all_match git read-tree --prefix=folder2/0/ -u rename-out-to-out && > ++ test_all_match git status --porcelain=v2 > +' > + > +test_expect_success 'read-tree --merge with directory-file conflicts' ' > 4: cb7e0cf419c ! 5: 015618a9f29 read-tree: integrate with sparse index > @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse index is n > + for MERGE_TREES in "update-folder2" > + do > + ensure_not_expanded read-tree -mu $MERGE_TREES && > -+ ensure_not_expanded reset --hard HEAD || return 1 > ++ ensure_not_expanded reset --hard || return 1 > + done > +' > + > 5: 4f05fa70209 ! 6: 1a9365c3bc5 read-tree: narrow scope of index expansion for '--prefix' > @@ Commit message > If the prefix is in-cone, its sparse subdirectories (if any) will be > traversed correctly without index expansion. > > + The behavior of 'git read-tree' with prefixes 1) inside of cone, 2) equal to > + a sparse directory, and 3) inside a sparse directory are all tested as part > + of the 't/t1092-sparse-checkout-compatibility.sh' test 'read-tree --prefix', > + ensuring that the sparse index case works the way it did prior to this > + change as well as matching non-sparse index sparse-checkout. > + > + Helped-by: Elijah Newren <newren@xxxxxxxxx> > Signed-off-by: Victoria Dye <vdye@xxxxxxxxxx> > > ## builtin/read-tree.c ## > @@ t/t1092-sparse-checkout-compatibility.sh > @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse index is not expanded: read-tree' ' > do > ensure_not_expanded read-tree -mu $MERGE_TREES && > - ensure_not_expanded reset --hard HEAD || return 1 > + ensure_not_expanded reset --hard || return 1 > - done > + done && > + > @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse index is n > test_expect_success 'ls-files' ' > > ## unpack-trees.c ## > -@@ unpack-trees.c: int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options > - setup_standard_excludes(o->dir); > - } > +@@ unpack-trees.c: static void populate_from_existing_patterns(struct unpack_trees_options *o, > + o->pl = pl; > + } > > -+ /* > -+ * If the prefix is equal to or contained within a sparse directory, the > -+ * index needs to be expanded to traverse with the specified prefix. > -+ */ > -+ if (o->prefix && o->src_index->sparse_index) { > -+ int prefix_len = strlen(o->prefix); > ++static void update_sparsity_for_prefix(const char *prefix, > ++ struct index_state *istate) > ++{ > ++ int prefix_len = strlen(prefix); > ++ struct strbuf ce_prefix = STRBUF_INIT; > + > -+ while (prefix_len > 0 && o->prefix[prefix_len - 1] == '/') > -+ prefix_len--; > ++ if (!istate->sparse_index) > ++ return; > + > -+ if (prefix_len > 0) { > -+ struct strbuf ce_prefix = STRBUF_INIT; > -+ strbuf_grow(&ce_prefix, prefix_len + 1); > -+ strbuf_add(&ce_prefix, o->prefix, prefix_len); > -+ strbuf_addch(&ce_prefix, '/'); > ++ while (prefix_len > 0 && prefix[prefix_len - 1] == '/') > ++ prefix_len--; > + > -+ /* > -+ * If the prefix is not inside the sparse cone, then the > -+ * index is explicitly expanded if it is found as a sparse > -+ * directory, or implicitly expanded (by 'index_name_pos') > -+ * if the path is inside a sparse directory. > -+ */ > -+ if (!path_in_cone_mode_sparse_checkout(ce_prefix.buf, o->src_index) && > -+ index_name_pos(o->src_index, ce_prefix.buf, ce_prefix.len) >= 0) > -+ ensure_full_index(o->src_index); > ++ if (prefix_len <= 0) > ++ BUG("Invalid prefix passed to update_sparsity_for_prefix"); > + > -+ strbuf_release(&ce_prefix); > -+ } > -+ } > ++ strbuf_grow(&ce_prefix, prefix_len + 1); > ++ strbuf_add(&ce_prefix, prefix, prefix_len); > ++ strbuf_addch(&ce_prefix, '/'); > ++ > ++ /* > ++ * If the prefix points to a sparse directory or a path inside a sparse > ++ * directory, the index should be expanded. This is accomplished in one > ++ * of two ways: > ++ * - if the prefix is inside a sparse directory, it will be expanded by > ++ * the 'ensure_full_index(...)' call in 'index_name_pos(...)'. > ++ * - if the prefix matches an existing sparse directory entry, > ++ * 'index_name_pos(...)' will return its index position, triggering > ++ * the 'ensure_full_index(...)' below. > ++ */ > ++ if (!path_in_cone_mode_sparse_checkout(ce_prefix.buf, istate) && > ++ index_name_pos(istate, ce_prefix.buf, ce_prefix.len) >= 0) > ++ ensure_full_index(istate); > ++ > ++ strbuf_release(&ce_prefix); > ++} > + > + static int verify_absent(const struct cache_entry *, > + enum unpack_trees_error_types, > +@@ unpack-trees.c: int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options > + setup_standard_excludes(o->dir); > + } > + > ++ if (o->prefix) > ++ update_sparsity_for_prefix(o->prefix, o->src_index); > + > if (!core_apply_sparse_checkout || !o->update) > o->skip_sparse_checkout = 1; > 6: 94c2aad2f93 ! 7: 71bec3ddad1 read-tree: make two-way merge sparse-aware > @@ Commit message > This process allows sparse directories to be individually merged at the > necessary depth *without* expanding a full index. > > + The 't/t1092-sparse-checkout-compatibility.sh' test 'read-tree --merge with > + edit/edit conflicts in sparse directories' tests two-way merges with 1) > + changes inside sparse directories that do not conflict and 2) changes that > + do conflict (with the correct file(s) reported in the error message). > + Additionally, add two-way merge cases to 'sparse index is not expanded: > + read-tree' to confirm that the index is not expanded regardless of whether > + edit/edit conflicts are present in a sparse directory. > + > Signed-off-by: Victoria Dye <vdye@xxxxxxxxxx> > > ## builtin/read-tree.c ## > @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse index is n > > ensure_not_expanded checkout -b test-branch update-folder1 && > - for MERGE_TREES in "update-folder2" > -+ for MERGE_TREES in "update-folder2" \ > -+ "base update-folder2" > ++ for MERGE_TREES in "base update-folder2" \ > ++ "base rename-base" \ > ++ "update-folder2" > do > ensure_not_expanded read-tree -mu $MERGE_TREES && > - ensure_not_expanded reset --hard HEAD || return 1 > + ensure_not_expanded reset --hard || return 1 > > ## unpack-trees.c ## > @@ unpack-trees.c: static int is_sparse_directory_entry(struct cache_entry *ce, > 7: c4080e99d6e ! 8: 6bc11325dd1 read-tree: make three-way merge sparse-aware > @@ Commit message > merge, the contents of each conflicted sparse directory are merged without > referencing the index, avoiding sparse index expansion. > > + As with two-way merge, the 't/t1092-sparse-checkout-compatibility.sh' test > + 'read-tree --merge with edit/edit conflicts in sparse directories' confirms > + that three-way merges with edit/edit changes (both with and without > + conflicts) inside a sparse directory result in the correct index state or > + error message. To ensure the index is not unnecessarily expanded, add > + three-way merge cases to 'sparse index is not expanded: read-tree'. > + > Signed-off-by: Victoria Dye <vdye@xxxxxxxxxx> > > ## builtin/read-tree.c ## > @@ builtin/read-tree.c: int cmd_read_tree(int argc, const char **argv, const char * > > ## t/t1092-sparse-checkout-compatibility.sh ## > @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse index is not expanded: read-tree' ' > + init_repos && > > ensure_not_expanded checkout -b test-branch update-folder1 && > - for MERGE_TREES in "update-folder2" \ > -- "base update-folder2" > +- for MERGE_TREES in "base update-folder2" \ > ++ for MERGE_TREES in "base HEAD update-folder2" \ > ++ "base HEAD rename-base" \ > + "base update-folder2" \ > -+ "base HEAD update-folder2" > + "base rename-base" \ > + "update-folder2" > do > - ensure_not_expanded read-tree -mu $MERGE_TREES && > - ensure_not_expanded reset --hard HEAD || return 1 > > ## unpack-trees.c ## > @@ unpack-trees.c: int threeway_merge(const struct cache_entry * const *stages, > @@ unpack-trees.c: int threeway_merge(const struct cache_entry * const *stages, > > if (head) { > /* #5ALT, #15 */ > +@@ unpack-trees.c: int threeway_merge(const struct cache_entry * const *stages, > + > + } > + > +- /* Below are "no merge" cases, which require that the index be > +- * up-to-date to avoid the files getting overwritten with > +- * conflict resolution files. > +- */ > ++ /* Handle "no merge" cases (see t/t1000-read-tree-m-3way.sh) */ > + if (index) { > ++ /* > ++ * If we've reached the "no merge" cases and we're merging > ++ * a sparse directory, we may have an "edit/edit" conflict that > ++ * can be resolved by individually merging directory contents. > ++ */ > ++ if (S_ISSPARSEDIR(index->ce_mode)) > ++ return merged_sparse_dir(stages, 4, o); > ++ > ++ /* > ++ * If we're not merging a sparse directory, ensure the index is > ++ * up-to-date to avoid files getting overwritten with conflict > ++ * resolution files > ++ */ > + if (verify_uptodate(index, o)) > + return -1; > + } > > -- > gitgitgadget