"Anh Le via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:

> From: Anh Le <anh@xxxxxxxxx>
>
> In a large repository using sparse checkout, checking
> whether a file marked with skip worktree is present
> on disk and its skip worktree bit should be cleared
> can take a considerable amount of time. Add a trace2
> region to surface this information.
>
> Signed-off-by: Anh Le <anh@xxxxxxxxx>
> ---
> index: add trace2 region for clear skip worktree
>
> In large repository using sparse checkout, checking whether a file
> marked with skip worktree is present on disk and its skip worktree bit
> should be cleared can take a considerable amount of time. Add a trace2
> region to surface this information.

It is easy to see that the change is no-op from functionality's
standpoint. The condition under which ce->ce_flags loses the
CE_SKIP_WORKTREE bit is the same as before, and the only change is
that in the iteration a couple of variables are incremented, which
may (or may not) have performance impact, but shouldn't break
correctness.

I am not sure about the value of these counters, honestly. If we
hit a sparse dir just once, we fully flatten the index and go back
to the restart state, and it would be a bug if we still see a
sparsified directory in the in-core index after that point, so
restart_count would be at most one, wouldn't it? Why do we even
need to count with intmax_t?

Similarly, path_count is bounded by istate->cache_nr. As you know
the approximate size of your project, I am not sure what extra
information you want to get by counting the paths with the skip bit
set before the first sparsified directory in the in-core index
twice, and the other paths with the skip bit set just once, and
adding these numbers together. Also again, I am not quite sure what
the point is to count the paths in intmax_t.
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1368%2FHaizzz%2Fmaster-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1368/Haizzz/master-v1
> Pull-Request: https://github.com/git/git/pull/1368
>
>  sparse-index.c | 21 +++++++++++++++------
>  1 file changed, 15 insertions(+), 6 deletions(-)
>
> diff --git a/sparse-index.c b/sparse-index.c
> index e4a54ce1943..d11049c8aeb 100644
> --- a/sparse-index.c
> +++ b/sparse-index.c
> @@ -493,24 +493,33 @@ void clear_skip_worktree_from_present_files(struct index_state *istate)
>  	int dir_found = 1;
>  
>  	int i;
> +	intmax_t path_count = 0;
> +	intmax_t restart_count = 0;
>  
>  	if (!core_apply_sparse_checkout ||
>  	    sparse_expect_files_outside_of_patterns)
>  		return;
>  
> +	trace2_region_enter("index", "clear_skip_worktree_from_present_files", istate->repo);
>  restart:
>  	for (i = 0; i < istate->cache_nr; i++) {
>  		struct cache_entry *ce = istate->cache[i];
>  
> -		if (ce_skip_worktree(ce) &&
> -		    path_found(ce->name, &last_dirname, &dir_len, &dir_found)) {
> -			if (S_ISSPARSEDIR(ce->ce_mode)) {
> -				ensure_full_index(istate);
> -				goto restart;
> +		if (ce_skip_worktree(ce)) {
> +			path_count++;
> +			if (path_found(ce->name, &last_dirname, &dir_len, &dir_found)) {
> +				if (S_ISSPARSEDIR(ce->ce_mode)) {
> +					ensure_full_index(istate);
> +					restart_count++;
> +					goto restart;
> +				}
> +				ce->ce_flags &= ~CE_SKIP_WORKTREE;
>  			}
> -			ce->ce_flags &= ~CE_SKIP_WORKTREE;
>  		}
>  	}
> +	trace2_data_intmax("index", istate->repo, "clear_skip_worktree_from_present_files/path_count", path_count);
> +	trace2_data_intmax("index", istate->repo, "clear_skip_worktree_from_present_files/restart_count", restart_count);
> +	trace2_region_leave("index", "clear_skip_worktree_from_present_files", istate->repo);
>  }
>  
>  /*
>
> base-commit: 1fc3c0ad407008c2f71dd9ae1241d8b75f8ef886
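
To make the bound on restart_count concrete, here is a toy model of
the loop in the patch. It is only a sketch under stated assumptions:
struct toy_entry, toy_expand() and toy_clear_skip_worktree() are
made-up names, not git code, and toy_expand() merely assumes that
ensure_full_index() leaves no sparse directory entry behind. The
real function replaces each such entry with the paths it contains,
which the toy does not model.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a cache entry; not git's struct cache_entry. */
struct toy_entry {
	int skip_worktree; /* models the CE_SKIP_WORKTREE bit */
	int sparse_dir;    /* models S_ISSPARSEDIR(ce->ce_mode) */
	int on_disk;       /* models the answer path_found() would give */
};

struct toy_counts {
	long path_count;
	long restart_count;
};

/*
 * Assumed behaviour of ensure_full_index(): once it has run, no
 * entry is a sparse directory any more.  The toy only clears the flag.
 */
static void toy_expand(struct toy_entry *e, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		e[i].sparse_dir = 0;
}

/* Same control flow as the patched loop, with the two counters. */
static struct toy_counts toy_clear_skip_worktree(struct toy_entry *e, size_t n)
{
	struct toy_counts c = { 0, 0 };
	size_t i;

restart:
	for (i = 0; i < n; i++) {
		if (!e[i].skip_worktree)
			continue;
		c.path_count++;
		if (!e[i].on_disk)
			continue;
		if (e[i].sparse_dir) {
			toy_expand(e, n);
			c.restart_count++;
			goto restart;
		}
		e[i].skip_worktree = 0;
	}

	/* After one expansion no sparse directory can trigger another. */
	assert(c.restart_count <= 1);
	return c;
}
```

With four entries, of which the third is a sparse directory that is
present on disk, the loop restarts once and counts five paths: three
on the first pass and two on the second, because the first entry has
already lost its skip bit by then.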