Re: [PATCH 05/10] status: use sparse-index throughout

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Apr 13, 2021 at 7:01 AM Derrick Stolee via GitGitGadget
<gitgitgadget@xxxxxxxxx> wrote:
>
> From: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
>
> By testing 'git -c core.fsmonitor= status -uno', we can check for the
> simplest index operations that can be made sparse-aware. The necessary
> implementation details are already integrated with sparse-checkout, so
> modify command_requires_full_index to be zero for cmd_status().
>
> By running the debugger for 'git status -uno' after that change, we find
> two instances of ensure_full_index() that were added for extra safety,
> but can be removed without issue.
>
> In refresh_index(), we loop through the index entries. The
> refresh_cache_ent() method copies the sparse directories into the
> refreshed index without issue.

I do see the removal of a call to ensure_full_index() in
refresh_index() that you mention in this paragraph in the patch below.

I'm confused, though; I would have thought we wanted to avoid a
refresh_cache_ent() call.  Also, one of your previous patches added a

    if (istate->sparse_index && S_ISSPARSEDIR(ce->ce_mode))
        continue;

check before the code ever gets to the refresh_cache_ent() call, so as
far as I can tell, that function won't be called from refresh_entry()
for sparse entries.  Maybe your commit message here is out-of-date?
Or am I confused somehow?

> The loop within run_diff_files() skips things that are in stage 0 and
> have skip-worktree enabled, so seems safe to disable ensure_full_index()
> here.

Unlike the above, I don't see a removal of a ensure_full_index() call
in run_diff_files() as claimed by this paragraph.  Has the commit
message gotten out of date with refactorings you did while developing
this series?

> This allows some cases of 'git status' to no longer expand a sparse
> index to a full one, giving the following performance improvements for
> p2000-sparse-checkout-operations.sh:
>
> Test                                  HEAD~1           HEAD
> -----------------------------------------------------------------------------
> 2000.2: git status (full-index-v3)    0.38(0.36+0.07)  0.37(0.31+0.10) -2.6%
> 2000.3: git status (full-index-v4)    0.38(0.29+0.12)  0.37(0.30+0.11) -2.6%
> 2000.4: git status (sparse-index-v3)  2.43(2.33+0.14)  0.04(0.05+0.04) -98.4%
> 2000.5: git status (sparse-index-v4)  2.44(2.35+0.13)  0.05(0.04+0.07) -98.0%
>
> Note that since HEAD~1 was expanding the sparse index by parsing trees,
> it was artificially slower than the full index case. Thus, the 98%
> improvement is misleading, and instead we should celebrate the 0.37s to
> 0.05s improvement of 82%. This is more indicative of the peformance
> gains we are expecting by using a sparse index.

82%, very nice.  Was this with git.git as the test repository, or some
other repo?  If it's git.git, then we'd actually expect a much bigger
speedup for other repositories, as git.git is pretty small.


> Note: we are dropping the assignment of core.fsmonitor here. This is not
> necessary for the test script as we are not altering the config any
> other way. Correct integration with FS Monitor will be validated in
> later changes.
>
> Signed-off-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
> ---
>  builtin/commit.c                         |  3 +++
>  read-cache.c                             |  2 --
>  t/t1092-sparse-checkout-compatibility.sh | 12 ++++++++----
>  3 files changed, 11 insertions(+), 6 deletions(-)
>
> diff --git a/builtin/commit.c b/builtin/commit.c
> index cf0c36d1dcb2..e529da7beadd 100644
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -1404,6 +1404,9 @@ int cmd_status(int argc, const char **argv, const char *prefix)
>         if (argc == 2 && !strcmp(argv[1], "-h"))
>                 usage_with_options(builtin_status_usage, builtin_status_options);
>
> +       prepare_repo_settings(the_repository);
> +       the_repository->settings.command_requires_full_index = 0;
> +
>         status_init_config(&s, git_status_config);
>         argc = parse_options(argc, argv, prefix,
>                              builtin_status_options,
> diff --git a/read-cache.c b/read-cache.c
> index 6308234b4838..83e6bdef7604 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -1578,8 +1578,6 @@ int refresh_index(struct index_state *istate, unsigned int flags,
>          */
>         preload_index(istate, pathspec, 0);
>         trace2_region_enter("index", "refresh", NULL);
> -       /* TODO: audit for interaction with sparse-index. */
> -       ensure_full_index(istate);
>         for (i = 0; i < istate->cache_nr; i++) {
>                 struct cache_entry *ce, *new_entry;
>                 int cache_errno = 0;
> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> index e488ef9bd941..380a085f8ec4 100755
> --- a/t/t1092-sparse-checkout-compatibility.sh
> +++ b/t/t1092-sparse-checkout-compatibility.sh
> @@ -449,12 +449,16 @@ test_expect_success 'sparse-index is expanded and converted back' '
>         GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
>                 git -C sparse-index -c core.fsmonitor="" reset --hard &&
>         test_region index convert_to_sparse trace2.txt &&
> -       test_region index ensure_full_index trace2.txt &&
> +       test_region index ensure_full_index trace2.txt
> +'
>
> -       rm trace2.txt &&
> +test_expect_success 'sparse-index is not expanded' '
> +       init_repos &&
> +
> +       rm -f trace2.txt &&
>         GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
> -               git -C sparse-index -c core.fsmonitor="" status -uno &&
> -       test_region index ensure_full_index trace2.txt
> +               git -C sparse-index status -uno &&
> +       test_region ! index ensure_full_index trace2.txt
>  '
>
>  test_done
> --
> gitgitgadget

Other than what looks like a couple issues in the commit message, the
change looks good to me.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux