On Tue, Aug 17, 2021 at 6:23 AM Derrick Stolee via GitGitGadget <gitgitgadget@xxxxxxxxx> wrote: > > UPDATE: I had to base this version on a merge of ds/add-with-sparse-index > and v2.33.0-rc1 because of some changes to dir.h that became important! > > We launched an experimental release [1] of the sparse-index feature to our > internal users. We immediately discovered a problem due to the isolated way > in which we tested the sparse index: we were never building the project and > changing our sparse-checkout definition. > > [1] https://github.com/microsoft/git/releases/tag/v2.32.0.vfs.0.102.exp > > Users who ran a build in one project and then moved to another still had > build artifacts in their worktree that lived inside the old directories. > Since the files are marked by the .gitignore patterns, these files were not > removed by the 'git sparse-checkout set' command. However, they make the > sparse-index unusable because every 'git status' command needs to expand the > sparse-directory entries in order to see if the files are tracked or not. > This made the first experimental release actually slower for all users > because of this cost. > > The solution we shipped to these customers was to change the way our fork > handles these ignored files. Specifically, instead of Git completely > ignoring the files, we changed Git to understand that with cone-mode > sparse-checkout patterns, the users is asking for entire directories to be > removed from the worktree. The link [1] included earlier has this change. > > I believe that this is a reasonable expectation, though I recognize that it > might look like breaking the expectations of how .gitignore files work. > > Since feedback demonstrated that this is a desired behavior, v2 includes > this behavior for all "cone mode" repositories. > > I'm interested in the community's thoughts about this change, as it seems > like one that we should make carefully and intentionally. > > While the rewrite of the t7519 test seems unrelated, it is required to avoid > a test failure with this change that deletes files outside of the cone. By > moving the test into repositories not at $TRASH_DIRECTORY, we gain more > control over the repository structure. > > > Update in V3 > ============ > > * As promised [2], the helper methods are fixed to work with non-cone-mode > patterns. A later series will use them to their fullest potential > (changing git add, git rm, and git mv when interacting with sparse > entries). Sorry I didn't review V2. I'll find some time in the next few days to review V3. > > [2] > https://lore.kernel.org/git/bac76c72-955d-1ade-4ecf-778ffc45f297@xxxxxxxxx/ > > > Updates in V2 > ============= > > * This version correctly leaves untracked files alone. If untracked files > are found, then the directory is left as-is, in case those ignored files > are important to the user's work resolving those untracked files. > > * This behavior is now enabled by core.sparseCheckoutCone=true. > > * To use a sparse index as an in-memory data structure even when > index.sparse is disabled, a new patch is included to modify the prototype > of convert_to_sparse() to include a flags parameter. > > * A few cleanup patches that I was collecting based on feedback from the > experimental release and intending for my next series were necessary for > this implementation. > > * Cleaned up the tests (no NEEDSWORK) and the remainders of a previous > implementation that used run_subcommand(). > > Thanks, -Stolee > > Derrick Stolee (8): > t7519: rewrite sparse index test > sparse-index: silently return when not using cone-mode patterns > sparse-index: silently return when cache tree fails > unpack-trees: fix nested sparse-dir search > sparse-checkout: create helper methods > attr: be careful about sparse directories > sparse-index: add SPARSE_INDEX_IGNORE_CONFIG flag > sparse-checkout: clear tracked sparse dirs > > Documentation/git-sparse-checkout.txt | 8 +++ > attr.c | 14 ++++ > builtin/add.c | 8 +-- > builtin/sparse-checkout.c | 95 +++++++++++++++++++++++++++ > dir.c | 33 ++++++++++ > dir.h | 6 ++ > read-cache.c | 4 +- > sparse-index.c | 70 +++++++++++--------- > sparse-index.h | 3 +- > t/t1091-sparse-checkout-builtin.sh | 59 +++++++++++++++++ > t/t7519-status-fsmonitor.sh | 38 ++++++----- > unpack-trees.c | 18 +++-- > 12 files changed, 294 insertions(+), 62 deletions(-) > > > base-commit: 80b8d6c56b8a5f5db1d5c2a0159fd808e8a7fc4f > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1009%2Fderrickstolee%2Fsparse-index%2Fignored-files-v3 > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1009/derrickstolee/sparse-index/ignored-files-v3 > Pull-Request: https://github.com/gitgitgadget/git/pull/1009 > > Range-diff vs v2: > > 1: e66106f7a99 = 1: e66106f7a99 t7519: rewrite sparse index test > 2: fb3ff9108bf = 2: fb3ff9108bf sparse-index: silently return when not using cone-mode patterns > 3: 37198535268 = 3: 37198535268 sparse-index: silently return when cache tree fails > 4: 10bcadb284e = 4: 10bcadb284e unpack-trees: fix nested sparse-dir search > 5: e9ed5cd2830 ! 5: 5d28570c82a sparse-checkout: create helper methods > @@ dir.c: done: > > +int init_sparse_checkout_patterns(struct index_state *istate) > +{ > -+ if (istate->sparse_checkout_patterns) > ++ if (!core_apply_sparse_checkout || > ++ istate->sparse_checkout_patterns) > + return 0; > + > + CALLOC_ARRAY(istate->sparse_checkout_patterns, 1); > @@ dir.c: done: > + return 0; > +} > + > -+enum pattern_match_result path_in_sparse_checkout(const char *path, > -+ struct index_state *istate) > ++int path_in_sparse_checkout(const char *path, > ++ struct index_state *istate) > +{ > ++ const char *base; > + int dtype = DT_REG; > + init_sparse_checkout_patterns(istate); > + > + if (!istate->sparse_checkout_patterns) > + return MATCHED; > + > -+ return path_matches_pattern_list(path, strlen(path), NULL, &dtype, > ++ base = strrchr(path, '/'); > ++ return path_matches_pattern_list(path, strlen(path), base ? base + 1 : path, > ++ &dtype, > + istate->sparse_checkout_patterns, > -+ istate); > ++ istate) > 0; > +} > + > static struct path_pattern *last_matching_pattern_from_lists( > @@ dir.h: enum pattern_match_result path_matches_pattern_list(const char *pathname, > + > +int init_sparse_checkout_patterns(struct index_state *state); > + > -+enum pattern_match_result path_in_sparse_checkout(const char *path, > -+ struct index_state *istate); > ++int path_in_sparse_checkout(const char *path, > ++ struct index_state *istate); > + > struct dir_entry *dir_add_ignored(struct dir_struct *dir, > struct index_state *istate, > 6: 5680df62e1c = 6: c9e100e68f8 attr: be careful about sparse directories > 7: 1dd73b36eb4 = 7: b0ece4b7dcc sparse-index: add SPARSE_INDEX_IGNORE_CONFIG flag > 8: f74015a2be9 = 8: febef675f05 sparse-checkout: clear tracked sparse dirs > > -- > gitgitgadget