Integrate `git-grep` with sparse-index and test the performance
improvement.

Changes since v2
----------------

* Modify the commit message for "builtin/grep.c: integrate with sparse
  index" to make it obvious that the perf test results are not from
  p2000 tests, but from manual perf runs.

* Add tree-walking logic as an extra (third) patch to improve
  performance when --sparse is used. This resolves the leftover bit
  from v2 [1].

[1] https://lore.kernel.org/git/20220829232843.183711-1-shaoxuan.yuan02@xxxxxxxxx/

Changes since v1
----------------

* Rewrite the commit message for "builtin/grep.c: add --sparse option"
  to be clearer.

* Update the documentation (both in-code and man page) for --sparse.

* Add a few tests for the new behavior (when _only_ --cached is
  supplied).

* Reformat the perf test results so they do not look like they come
  directly from p2000 tests.

* Put the "command_requires_full_index" lines right after
  parse_options().

* Add a pathspec test in t1092, and reword a few test descriptions.

Shaoxuan Yuan (3):
  builtin/grep.c: add --sparse option
  builtin/grep.c: integrate with sparse index
  builtin/grep.c: walking tree instead of expanding index with --sparse

 Documentation/git-grep.txt               |  5 ++-
 builtin/grep.c                           | 46 +++++++++++++++++++++---
 t/perf/p2000-sparse-operations.sh        |  1 +
 t/t1092-sparse-checkout-compatibility.sh | 18 ++++++++++
 t/t7817-grep-sparse-checkout.sh          | 34 ++++++++++++++----
 5 files changed, 92 insertions(+), 12 deletions(-)

Range-diff against v2:
1:  ab5ff488a1 = 1:  db1f5a5409 builtin/grep.c: add --sparse option
2:  68c7ecee73 ! 2:  af566c7862 builtin/grep.c: integrate with sparse index
    @@ Commit message

         Turn on sparse index and remove ensure_full_index().

    -    Change it to only expands the index when using --sparse.
    +    Change it to only expand the index when using --sparse.

    -    The p2000 tests demonstrate a ~99.4% execution time reduction for
    +    The p2000 tests do not demonstrate a significant improvement,
    +    because the index read is a small portion of the full process
    +    time, compared to the blob parsing. The times below reflect the
    +    time spent in the "do_read_index" trace region as shown using
    +    GIT_TRACE2_PERF=1.
    +
    +    The tests demonstrate a ~99.4% execution time reduction for
         `git grep` using a sparse index.

    -    Test                                  Before  After
    +    Test                                  HEAD~   HEAD
         -----------------------------------------------------------------------------
         git grep --cached bogus (full-v3)     0.019   0.018 (-5.2%)
         git grep --cached bogus (full-v4)     0.017   0.016 (-5.8%)
    @@ builtin/grep.c: int cmd_grep(int argc, const char **argv, const char *prefix)
      	int fallback = 0;
      	git_config_get_bool("grep.fallbacktonoindex", &fallback);

    - ## t/perf/p2000-sparse-operations.sh ##
    -@@ t/perf/p2000-sparse-operations.sh: test_perf_on_all git read-tree -mu HEAD
    - test_perf_on_all git checkout-index -f --all
    - test_perf_on_all git update-index --add --remove $SPARSE_CONE/a
    - test_perf_on_all "git rm -f $SPARSE_CONE/a && git checkout HEAD -- $SPARSE_CONE/a"
    -+test_perf_on_all git grep --cached bogus
    -
    - test_done
    -
      ## t/t1092-sparse-checkout-compatibility.sh ##
     @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse index is not expanded: rm' '
      	ensure_not_expanded rm -r deep
-:  ---------- > 3:  757ac7ddee builtin/grep.c: walking tree instead of expanding index with --sparse

base-commit: d42b38dfb5edf1a7fddd9542d722f91038407819
-- 
2.37.0