While doing some investigation in a private monorepo with sparse-checkout
and a sparse index, I accidentally left a modified file outside of my
sparse-checkout cone. This caused my Git commands to slow to a crawl, so I
reran with GIT_TRACE2_PERF=1.

While I was able to identify clear_skip_worktree_from_present_files() as the
culprit, it took longer than desired to figure out what was going on. This
series intends to both fix the performance issue (as much as possible) and
do some refactoring to make it easier to understand what is happening.

In the end, I was able to reduce the number of lstat() calls in my case from
over 1.1 million to about 4,400, improving the time from 13.4s to 81ms on a
warm disk cache. (These numbers are from a test after v2, which somehow hit
the old caching algorithm even worse than my test in v1.)


Updates in v3
=============

 * Removed the incorrect paragraph in the commit message of patch 1.
 * Replaced "largest" with "longest" in the final patch.

Thanks, Stolee

Derrick Stolee (5):
  sparse-checkout: refactor skip worktree retry logic
  sparse-index: refactor path_found()
  sparse-index: use strbuf in path_found()
  sparse-index: count lstat() calls
  sparse-index: improve lstat caching of sparse paths

 sparse-index.c | 216 +++++++++++++++++++++++++++++++++++++------------
 1 file changed, 164 insertions(+), 52 deletions(-)


base-commit: 66ac6e4bcd111be3fa9c2a6b3fafea718d00678d
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1754%2Fderrickstolee%2Fclear-skip-speed-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1754/derrickstolee/clear-skip-speed-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1754

Range-diff vs v2:

 1:  93d0baed0b0 ! 1:  0844cda94cf sparse-checkout: refactor skip worktree retry logic
    @@ Commit message
         stored in the index, so caching was introduced in d79d299352 (Accelerate
         clear_skip_worktree_from_present_files() by caching, 2022-01-14).
     
    -    If users are having trouble with the performance of this operation and
    -    don't care about paths outside of the sparse-checkout, they can disable
    -    them using the sparse.expectFilesOutsideOfPatterns config option
    -    introduced in ecc7c8841d (repo_read_index: add config to expect files
    -    outside sparse patterns, 2022-02-25).
    -
         This check is particularly confusing in the presence of a sparse index, as
         a sparse tree entry corresponding to an existing directory must first be
         expanded to a full index before examining the paths within. This is
 2:  69c3beaabf7 = 2:  c242e2c9168 sparse-index: refactor path_found()
 3:  0a82e6b4183 = 3:  ad63bf746ca sparse-index: use strbuf in path_found()
 4:  9549f5b8062 = 4:  db6ded0df0d sparse-index: count lstat() calls
 5:  0cb344ac14f ! 5:  1f58e19691f sparse-index: improve lstat caching of sparse paths
    @@ sparse-index.c: static void clear_path_found_data(struct path_found_data *data)
      }
      
     +/**
    -+ * Return the length of the largest common substring that ends in a
    -+ * slash ('/') to indicate the largest common parent directory. Returns
    ++ * Return the length of the longest common substring that ends in a
    ++ * slash ('/') to indicate the longest common parent directory. Returns
     + * zero if no common directory exists.
     + */
     +static size_t max_common_dir_prefix(const char *path1, const char *path2)

-- 
gitgitgadget
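
For readers following the range-diff, here is a minimal sketch of what the doc comment on max_common_dir_prefix() describes: the length of the longest shared prefix of two paths that ends in a slash, or zero when they share no parent directory. The name sketch_common_dir_prefix() is hypothetical and the body is written only from that comment. It is not necessarily the code in patch 5.

```c
#include <stddef.h>

/*
 * Hypothetical illustration, not the function from the series.
 * Walk both paths in lockstep and remember the position just past
 * the last '/' on which they still agree.
 */
static size_t sketch_common_dir_prefix(const char *path1, const char *path2)
{
	size_t common_prefix = 0;
	size_t i;

	for (i = 0; path1[i] && path2[i]; i++) {
		if (path1[i] != path2[i])
			break;

		/* Both paths agree here; include the slash in the length. */
		if (path1[i] == '/')
			common_prefix = i + 1;
	}

	return common_prefix;
}
```

Under this sketch, "a/b/c" and "a/b/d" share the four-byte prefix "a/b/", while "dir/file" and "dir2/file" share no directory and yield zero. Such a length could tell a cache how much of a previously checked directory path still applies to the next path, which is the kind of reuse that cuts down on lstat() calls.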