Like previous integrations [1] [2], this series allows 'git read-tree' to operate using a sparse index. The first three patches are bugfixes for issues found while implementing the 'read-tree' integration: * The first (patch [1/8]) fixes an edge case in which a repo with no in-cone files or directories would have its root collapsed into a sparse directory; the fix ensures the root directory is never collapsed into a sparse directory. * The second (patch [2/8]) corrects the 'git status' reporting of changes nested inside the subdirectory of a sparse directory, ensuring that the modified file (not the subdirectory) is correctly reported as having changes. * The third (patch [3/8]) explicitly disallows the somewhat ambiguous case of using 'git read-tree --prefix=/' to represent the repo root (it could be interpreted as a leading slash - even though it's actually trailing slash - which is not allowed for prefixes). Now, an error specifying that prefixes cannot begin with '/' guides users more directly to the correct syntax. If a user does want to specify the repo root as their prefix, that can still be done with 'git read-tree --prefix=' The remainder of the series focuses on utilizing the sparse index in 'git read-tree'. After some baseline behavior-establishing tests (patch [4/8]), sparse index usage is trivially enabled (patch [5/8]) for 'read-tree' except: * usage with '--prefix' * two- and three-way merge These cases require additional changes in order to work as expected (i.e., outwardly matching non-sparse index sparse-checkout). For the former, the sparse index can be enabled as long as the index is expanded when the prefix is a directory outside the sparse cone (patch [6/8]). For the latter, sparse directories that cannot be trivially merged must have their contents merged file-by-file, done by recursively traversing the trees represented by the sparse directories (patches [7/8] & [8/8]). Changes since V2 ================ * revised/clarified explanation for why the 'recursive' flag is necessary in 'git status' internal diff in both the commit message for [2/8] and the code comments * added patch [3/8] to disallow '/' as a prefix * moved the loop over different 'read-tree' scenarios outside the test in t1092 test 'read-tree --merge with files outside sparse definition', separating the tests into individually-runnable and traceable cases * improved test case clarity and variety in 'read-tree --prefix' tests * removed redundant arg to 'git reset' in 'sparse index is not expanded: read-tree' * pointed out pre-existing testing that covers '--prefix' in [6/8] commit message * moved prefix-dependent index expansion logic into function 'update_sparsity_for_prefix', improved explanation for function behavior in comments * pointed out pre-existing testing that covers two- and three-way merge in [7/8] & [8/8] commit messages, respectively * added test cases for "ensure not expanded" tests of read-tree for both two- and three-way merge * fixed issue with three-way merge implementation (if the "no merge" cases described in 't/t1000-read-tree-m-3way.sh' are reached, sparse directories should be recursively merged), testing added in the "ensure not expanded" cases Changes since V1 ================ * switched an empty string check from '!strlen(path)' to the slightly-less-expensive '!*path' Thanks! * Victoria [1] https://lore.kernel.org/git/pull.1109.v2.git.1641924306.gitgitgadget@xxxxxxxxx/ [2] https://lore.kernel.org/git/pull.1048.v6.git.1638201164.gitgitgadget@xxxxxxxxx/ Victoria Dye (8): sparse-index: prevent repo root from becoming sparse status: fix nested sparse directory diff in sparse index read-tree: explicitly disallow prefixes with a leading '/' read-tree: expand sparse checkout test coverage read-tree: integrate with sparse index read-tree: narrow scope of index expansion for '--prefix' read-tree: make two-way merge sparse-aware read-tree: make three-way merge sparse-aware builtin/read-tree.c | 14 ++- dir.c | 7 +- t/perf/p2000-sparse-operations.sh | 1 + t/t1003-read-tree-prefix.sh | 10 ++ t/t1092-sparse-checkout-compatibility.sh | 133 ++++++++++++++++++++ unpack-trees.c | 147 +++++++++++++++++++++-- wt-status.c | 9 ++ 7 files changed, 308 insertions(+), 13 deletions(-) base-commit: e6ebfd0e8cbbd10878070c8a356b5ad1b3ca464e Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1157%2Fvdye%2Fsparse%2Fread-tree-v3 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1157/vdye/sparse/read-tree-v3 Pull-Request: https://github.com/gitgitgadget/git/pull/1157 Range-diff vs v2: 1: 744668eeece = 1: 744668eeece sparse-index: prevent repo root from becoming sparse 2: f0cff03b95d ! 2: 26f4d30bd95 status: fix nested sparse directory diff in sparse index @@ Metadata ## Commit message ## status: fix nested sparse directory diff in sparse index - Add the 'recursive' flag to 'wt_status_collect_changes_index(...)'. Without - the 'recursive' flag, 'git status' could report index changes incorrectly - when the following conditions were met: + Enable the 'recursive' diff option for the diff executed as part of 'git + status'. Without the 'recursive' enabled, 'git status' reports index + changes incorrectly when the following conditions were met: * sparse index is enabled * there is a difference between the index and HEAD in a file inside a *subdirectory* of a sparse directory * the sparse directory index entry is *not* expanded in-core - In this scenario, 'git status' would not recurse into the sparse directory's - subdirectories to identify which file contained the difference between the - index and HEAD. Instead, it would report the immediate subdirectory itself - as "modified". + Because it is not recursive by default, the diff in 'git status' reports + changes only at the level of files and directories that are immediate + children of a sparse directory, rather than recursing into directories with + changes to identify the modified file(s). As a result, 'git status' reports + the immediate subdirectory itself as "modified". Example: @@ Commit message (use "git restore --staged <file>..." to unstage) modified: sparse/sub - The 'recursive' diff option in 'wt_status_collect_changes_index' corrects - this by indicating that 'git status' should recurse into sparse directories - to find modified files. Given the same repository setup as the example - above, the corrected result of `git status` is: + Enabling the 'recursive' diff option in 'wt_status_collect_changes_index' + corrects this issue by allowing the diff to recurse into subdirectories of + sparse directories to find modified files. Given the same repository setup + as the example above, the corrected result of `git status` is: $ git status On branch master @@ wt-status.c: static void wt_status_collect_changes_index(struct wt_status *s) rev.diffopt.rename_score = s->rename_score >= 0 ? s->rename_score : rev.diffopt.rename_score; + + /* -+ * The `recursive` option must be enabled to show differences in files -+ * *more than* one level deep in a sparse directory index entry (e.g., given -+ * sparse directory 'sparse-dir/', reporting a difference in the file -+ * 'sparse-dir/another-dir/my-file'). ++ * The `recursive` option must be enabled to allow the diff to recurse ++ * into subdirectories of sparse directory index entries. If it is not ++ * enabled, a subdirectory containing file(s) with changes is reported ++ * as "modified", rather than the modified files themselves. + */ + rev.diffopt.flags.recursive = 1; + -: ----------- > 3: e48a281a4d3 read-tree: explicitly disallow prefixes with a leading '/' 3: ffe0b6aff2b ! 4: 90ebcb7b8ff read-tree: expand sparse checkout test coverage @@ Commit message emphasis is placed on interactions with files outside the sparse cone, e.g. merges with out-of-cone conflicts. + Helped-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> Signed-off-by: Victoria Dye <vdye@xxxxxxxxxx> ## t/perf/p2000-sparse-operations.sh ## @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'update-index --ca test_cmp expect sparse-checkout-out ' -+test_expect_success 'read-tree --merge with files outside sparse definition' ' -+ init_repos && -+ -+ test_all_match git checkout -b test-branch update-folder1 && -+ for MERGE_TREES in "base HEAD update-folder2" \ -+ "update-folder1 update-folder2" \ -+ "update-folder2" -+ do -+ # Clean up and remove on-disk files -+ test_all_match git reset --hard HEAD && -+ test_sparse_match git sparse-checkout reapply && ++for MERGE_TREES in "base HEAD update-folder2" \ ++ "update-folder1 update-folder2" \ ++ "update-folder2" ++do ++ test_expect_success "'read-tree -mu $MERGE_TREES' with files outside sparse definition" ' ++ init_repos && + + # Although the index matches, without --no-sparse-checkout, outside-of- + # definition files will not exist on disk for sparse checkouts @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'update-index --ca + test_all_match git read-tree -mu --no-sparse-checkout $MERGE_TREES && + test_all_match git status --porcelain=v2 && + test_cmp sparse-checkout/folder2/a sparse-index/folder2/a && -+ test_cmp sparse-checkout/folder2/a full-checkout/folder2/a || return 1 -+ done -+' ++ test_cmp sparse-checkout/folder2/a full-checkout/folder2/a ++ ++ ' ++done + +test_expect_success 'read-tree --merge with edit/edit conflicts in sparse directories' ' + init_repos && @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'update-index --ca + test_all_match test_must_fail git read-tree -mu rename-out-to-in update-folder1 +' + -+test_expect_success 'read-tree --prefix outside sparse definition' ' ++test_expect_success 'read-tree --prefix' ' + init_repos && + -+ # Cannot read-tree --prefix with a single argument when files exist within -+ # prefix -+ test_all_match test_must_fail git read-tree --prefix=folder1/ -u update-folder1 && ++ # If files differing between the index and target <commit-ish> exist ++ # inside the prefix, `read-tree --prefix` should fail ++ test_all_match test_must_fail git read-tree --prefix=deep/ deepest && ++ test_all_match test_must_fail git read-tree --prefix=folder1/ update-folder1 && + -+ test_all_match git read-tree --prefix=folder2/0 -u rename-base && -+ test_path_is_missing sparse-checkout/folder2 && -+ test_path_is_missing sparse-index/folder2 && ++ # If no differing index entries exist matching the prefix, ++ # `read-tree --prefix` updates the index successfully ++ test_all_match git rm -rf deep/deeper1/deepest/ && ++ test_all_match git read-tree --prefix=deep/deeper1/deepest -u deepest && ++ test_all_match git status --porcelain=v2 && ++ ++ test_all_match git rm -rf --sparse folder1/ && ++ test_all_match git read-tree --prefix=folder1/ -u update-folder1 && ++ test_all_match git status --porcelain=v2 && + -+ test_all_match git read-tree --reset -u HEAD && -+ test_all_match git read-tree --prefix=folder2/0 -u --no-sparse-checkout rename-base && -+ test_cmp sparse-checkout/folder2/0/a sparse-index/folder2/0/a && -+ test_cmp sparse-checkout/folder2/0/a full-checkout/folder2/0/a ++ test_all_match git rm -rf --sparse folder2/0 && ++ test_all_match git read-tree --prefix=folder2/0/ -u rename-out-to-out && ++ test_all_match git status --porcelain=v2 +' + +test_expect_success 'read-tree --merge with directory-file conflicts' ' 4: cb7e0cf419c ! 5: 015618a9f29 read-tree: integrate with sparse index @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse index is n + for MERGE_TREES in "update-folder2" + do + ensure_not_expanded read-tree -mu $MERGE_TREES && -+ ensure_not_expanded reset --hard HEAD || return 1 ++ ensure_not_expanded reset --hard || return 1 + done +' + 5: 4f05fa70209 ! 6: 1a9365c3bc5 read-tree: narrow scope of index expansion for '--prefix' @@ Commit message If the prefix is in-cone, its sparse subdirectories (if any) will be traversed correctly without index expansion. + The behavior of 'git read-tree' with prefixes 1) inside of cone, 2) equal to + a sparse directory, and 3) inside a sparse directory are all tested as part + of the 't/t1092-sparse-checkout-compatibility.sh' test 'read-tree --prefix', + ensuring that the sparse index case works the way it did prior to this + change as well as matching non-sparse index sparse-checkout. + + Helped-by: Elijah Newren <newren@xxxxxxxxx> Signed-off-by: Victoria Dye <vdye@xxxxxxxxxx> ## builtin/read-tree.c ## @@ t/t1092-sparse-checkout-compatibility.sh @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse index is not expanded: read-tree' ' do ensure_not_expanded read-tree -mu $MERGE_TREES && - ensure_not_expanded reset --hard HEAD || return 1 + ensure_not_expanded reset --hard || return 1 - done + done && + @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse index is n test_expect_success 'ls-files' ' ## unpack-trees.c ## -@@ unpack-trees.c: int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options - setup_standard_excludes(o->dir); - } +@@ unpack-trees.c: static void populate_from_existing_patterns(struct unpack_trees_options *o, + o->pl = pl; + } -+ /* -+ * If the prefix is equal to or contained within a sparse directory, the -+ * index needs to be expanded to traverse with the specified prefix. -+ */ -+ if (o->prefix && o->src_index->sparse_index) { -+ int prefix_len = strlen(o->prefix); ++static void update_sparsity_for_prefix(const char *prefix, ++ struct index_state *istate) ++{ ++ int prefix_len = strlen(prefix); ++ struct strbuf ce_prefix = STRBUF_INIT; + -+ while (prefix_len > 0 && o->prefix[prefix_len - 1] == '/') -+ prefix_len--; ++ if (!istate->sparse_index) ++ return; + -+ if (prefix_len > 0) { -+ struct strbuf ce_prefix = STRBUF_INIT; -+ strbuf_grow(&ce_prefix, prefix_len + 1); -+ strbuf_add(&ce_prefix, o->prefix, prefix_len); -+ strbuf_addch(&ce_prefix, '/'); ++ while (prefix_len > 0 && prefix[prefix_len - 1] == '/') ++ prefix_len--; + -+ /* -+ * If the prefix is not inside the sparse cone, then the -+ * index is explicitly expanded if it is found as a sparse -+ * directory, or implicitly expanded (by 'index_name_pos') -+ * if the path is inside a sparse directory. -+ */ -+ if (!path_in_cone_mode_sparse_checkout(ce_prefix.buf, o->src_index) && -+ index_name_pos(o->src_index, ce_prefix.buf, ce_prefix.len) >= 0) -+ ensure_full_index(o->src_index); ++ if (prefix_len <= 0) ++ BUG("Invalid prefix passed to update_sparsity_for_prefix"); + -+ strbuf_release(&ce_prefix); -+ } -+ } ++ strbuf_grow(&ce_prefix, prefix_len + 1); ++ strbuf_add(&ce_prefix, prefix, prefix_len); ++ strbuf_addch(&ce_prefix, '/'); ++ ++ /* ++ * If the prefix points to a sparse directory or a path inside a sparse ++ * directory, the index should be expanded. This is accomplished in one ++ * of two ways: ++ * - if the prefix is inside a sparse directory, it will be expanded by ++ * the 'ensure_full_index(...)' call in 'index_name_pos(...)'. ++ * - if the prefix matches an existing sparse directory entry, ++ * 'index_name_pos(...)' will return its index position, triggering ++ * the 'ensure_full_index(...)' below. ++ */ ++ if (!path_in_cone_mode_sparse_checkout(ce_prefix.buf, istate) && ++ index_name_pos(istate, ce_prefix.buf, ce_prefix.len) >= 0) ++ ensure_full_index(istate); ++ ++ strbuf_release(&ce_prefix); ++} + + static int verify_absent(const struct cache_entry *, + enum unpack_trees_error_types, +@@ unpack-trees.c: int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options + setup_standard_excludes(o->dir); + } + ++ if (o->prefix) ++ update_sparsity_for_prefix(o->prefix, o->src_index); + if (!core_apply_sparse_checkout || !o->update) o->skip_sparse_checkout = 1; 6: 94c2aad2f93 ! 7: 71bec3ddad1 read-tree: make two-way merge sparse-aware @@ Commit message This process allows sparse directories to be individually merged at the necessary depth *without* expanding a full index. + The 't/t1092-sparse-checkout-compatibility.sh' test 'read-tree --merge with + edit/edit conflicts in sparse directories' tests two-way merges with 1) + changes inside sparse directories that do not conflict and 2) changes that + do conflict (with the correct file(s) reported in the error message). + Additionally, add two-way merge cases to 'sparse index is not expanded: + read-tree' to confirm that the index is not expanded regardless of whether + edit/edit conflicts are present in a sparse directory. + Signed-off-by: Victoria Dye <vdye@xxxxxxxxxx> ## builtin/read-tree.c ## @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse index is n ensure_not_expanded checkout -b test-branch update-folder1 && - for MERGE_TREES in "update-folder2" -+ for MERGE_TREES in "update-folder2" \ -+ "base update-folder2" ++ for MERGE_TREES in "base update-folder2" \ ++ "base rename-base" \ ++ "update-folder2" do ensure_not_expanded read-tree -mu $MERGE_TREES && - ensure_not_expanded reset --hard HEAD || return 1 + ensure_not_expanded reset --hard || return 1 ## unpack-trees.c ## @@ unpack-trees.c: static int is_sparse_directory_entry(struct cache_entry *ce, 7: c4080e99d6e ! 8: 6bc11325dd1 read-tree: make three-way merge sparse-aware @@ Commit message merge, the contents of each conflicted sparse directory are merged without referencing the index, avoiding sparse index expansion. + As with two-way merge, the 't/t1092-sparse-checkout-compatibility.sh' test + 'read-tree --merge with edit/edit conflicts in sparse directories' confirms + that three-way merges with edit/edit changes (both with and without + conflicts) inside a sparse directory result in the correct index state or + error message. To ensure the index is not unnecessarily expanded, add + three-way merge cases to 'sparse index is not expanded: read-tree'. + Signed-off-by: Victoria Dye <vdye@xxxxxxxxxx> ## builtin/read-tree.c ## @@ builtin/read-tree.c: int cmd_read_tree(int argc, const char **argv, const char * ## t/t1092-sparse-checkout-compatibility.sh ## @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse index is not expanded: read-tree' ' + init_repos && ensure_not_expanded checkout -b test-branch update-folder1 && - for MERGE_TREES in "update-folder2" \ -- "base update-folder2" +- for MERGE_TREES in "base update-folder2" \ ++ for MERGE_TREES in "base HEAD update-folder2" \ ++ "base HEAD rename-base" \ + "base update-folder2" \ -+ "base HEAD update-folder2" + "base rename-base" \ + "update-folder2" do - ensure_not_expanded read-tree -mu $MERGE_TREES && - ensure_not_expanded reset --hard HEAD || return 1 ## unpack-trees.c ## @@ unpack-trees.c: int threeway_merge(const struct cache_entry * const *stages, @@ unpack-trees.c: int threeway_merge(const struct cache_entry * const *stages, if (head) { /* #5ALT, #15 */ +@@ unpack-trees.c: int threeway_merge(const struct cache_entry * const *stages, + + } + +- /* Below are "no merge" cases, which require that the index be +- * up-to-date to avoid the files getting overwritten with +- * conflict resolution files. +- */ ++ /* Handle "no merge" cases (see t/t1000-read-tree-m-3way.sh) */ + if (index) { ++ /* ++ * If we've reached the "no merge" cases and we're merging ++ * a sparse directory, we may have an "edit/edit" conflict that ++ * can be resolved by individually merging directory contents. ++ */ ++ if (S_ISSPARSEDIR(index->ce_mode)) ++ return merged_sparse_dir(stages, 4, o); ++ ++ /* ++ * If we're not merging a sparse directory, ensure the index is ++ * up-to-date to avoid files getting overwritten with conflict ++ * resolution files ++ */ + if (verify_uptodate(index, o)) + return -1; + } -- gitgitgadget