Re: [PATCH v4] status: long status advice adapted to recent capabilities

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Nov 9, 2022 at 11:46 PM Rudy Rigot via GitGitGadget
<gitgitgadget@xxxxxxxxx> wrote:
> Improve the advice displayed when `git status` is slow because
> of excessive numbers of untracked files.  Update the `git status`
> man page to explain the various configuration options.
>
> `git status` can be slow when there are a large number of untracked
> files and directories, because Git must search the entire worktree
> to enumerate them.  Previously, Git would print an advice message
> with the elapsed search time and a suggestion to disable the search
> using the `-uno` option.  This suggestion also carried a warning
> that might scare off some users.
>
> Git can reduce the size and time of the untracked file search when
> the `core.untrackedCache` and `core.fsmonitor` features are enabled
> by caching results from previous `git status` invocations.
>
> Update the advice to explain the various combinations of additional
> configuration options and refer to (new) documentation in the man
> page that explains it in more detail than what can be printed in an
> advice message.
>
> Finally, add new tests to verify the new functionality.
>
> Signed-off-by: Rudy Rigot <rudy.rigot@xxxxxxxxx>
> ---
> diff --git a/Documentation/git-status.txt b/Documentation/git-status.txt
> @@ -457,6 +457,61 @@ during the write may conflict with other simultaneous processes, causing
> +UNTRACKED FILES AND STATUS SPEED
> +--------------------------------
> +
> +`git status` can be very slow in large worktrees if/when it
> +needs to search for untracked files and directories.  There are
> +many configuration options available to speed this up by either
> +avoiding the work or making use of cached results from previous
> +Git commands.  Since we all work in different ways, there is no
> +single optimum set of settings right for everyone.  Here is a

Not necessarily worth a reroll, but the "since we all work..."
fragment could be dropped without any loss of clarity:

    There is no single optimum configuration suitable
    to every situation.

would probably be sufficient.

> +brief summary of the relevant options to help you choose which
> +is right for you.  Each of these settings is independently
> +documented elsewhere in more detail, so please refer to them
> +for complete details.

This leaves the reader hanging somewhat by not saying where
"elsewhere" actually is. You can use gitlink: to insert links to the
appropriate documentation.

> +* The `-uno` flag or the `status.showUntrackedfiles=false`
> +    config : indicate that `git status` should not report untracked
> +       files. This is the fastest option. `git status` will not list
> +       the untracked files, so you need to be careful to remember if
> +       you create any new files and manually `git add` them.

This and all the other bullet points use an odd mix of spaces and TAB
for indentation. They should consistently use one or the other.

Earlier in this thread, Ævar suggested that it might be worthwhile to
mention an additional possibility: that simply rerunning `git status`
might be sufficient to achieve reasonable speed since the second run
of `git status` may benefit from the filesystem cache having been
primed by the first invocation of `git status`. As such, it may be
overkill for a user to dive into the various options described here.
Hence, to mitigate the possibility of a user doing a lot of research
and extra unnecessary configuration, it might make sense for the very
first bullet point (before this one) to say merely:

    * Do nothing. Subsequent invocations of `git status` may be faster
      simply because the first `git status` primed the filesystem cache.

or something like that (probably more formal, though).

> diff --git a/t/t7065-wtstatus-slow.sh b/t/t7065-wtstatus-slow.sh
> @@ -0,0 +1,74 @@
> +#!/bin/sh
> +
> +test_description='test status when slow untracked files'
> +
> +. ./test-lib.sh
> +
> +DATA="$TEST_DIRECTORY/t7065"

This variable is unused, isn't it, now that you're inlining everything
with here-docs?

> +GIT_TEST_UF_DELAY_WARNING=1
> +export GIT_TEST_UF_DELAY_WARNING
> +
> +test_expect_success setup '
> +       git checkout -b test
> +'
> +
> +test_expect_success 'when core.untrackedCache and fsmonitor are unset' '
> +       test_must_fail git config --get core.untrackedCache &&
> +       test_must_fail git config --get core.fsmonitor &&

Rather than asserting that some preceding code has left the
configuration in a state this test wants, it would be more robust for
this test to forcibly set up the state it wants. Something like this
should work:

    test_might_fail git config ---unset-all core.untrackedCache &&
    test_might_fail git config ---unset-all core.fsmonitor &&

> +       git status | sed "s/[0-9]\.[0-9][0-9]/X/g" >../actual &&

At all costs, avoid escaping the "trash" directory in which this test
is running. All files created by this test should be within the
"trash" directory hierarchy. Therefore, drop "../" from all the
filenames, and instead create these files directly in the "trash"
directory or in some subdirectory of the "trash" directory.

I presume the reason you're escaping the "trash" directory is because
you don't want these untracked "actual" and "expected" files to
pollute the `git status` output you're testing? If so, then you should
probably instead set up a .gitignore file in the "trash" directory to
make `git status` ignore them.

We usually want to avoid having a Git command upstream of a pipe since
the exit code of the Git command will be lost. So, we'd typically do
this instead:

    git status >out &&
    sed "s/[0-9]\.[0-9][0-9]/X/g" out >actual &&

> +       cat >../expected <<-EOF &&
> +On branch test
> +
> +No commits yet
> +
> +
> +It took X seconds to enumerate untracked files.
> +See '"'"'git help status'"'"' for information on how to improve this.
> +
> +nothing to commit (create/copy files and use "git add" to track)
> +       EOF

Two comments. First, use <<-\EOF rather than <<-EOF to make it clear
to readers that you're not interpolating any variables in the here-doc
body. Second, the <<- operator allows you to indent the here-doc body
(with TABs, not spaces), so you can align the body with the rest of
the code:

    cat >expected <<-\EOF &&
    On branch test
    ...
    EOF

> +       test_cmp ../expected ../actual &&
> +       rm -fr ../actual ../expected
> +'

We usually don't bother cleaning up these files if they don't impact
subsequent tests negatively. However, note that if any code above the
`rm -fr` command fails, then the `rm -fr` command itself won't be run,
hence the files won't get cleaned up upon test failure. To ensure
cleanup (if that's what you desire), use test_when_finished() _before_
code which may fail. So, something like this is typical:

    test_expect_success 'when core.untrackedCache and fsmonitor are unset' '
        test_when_finished "rm -fr actual expected" &&
        test_might_fail git config ---unset-all core.untrackedCache &&
        test_might_fail git config ---unset-all core.fsmonitor &&
        ...
    '

Same comments apply to the other new tests added by this patch.

> diff --git a/wt-status.c b/wt-status.c
> @@ -1205,6 +1207,17 @@ static void wt_longstatus_print_tracking(struct wt_status *s)
> +static inline int uf_was_slow(uint32_t untracked_in_ms)
> +{

Does this need to be inline? Is this a hot piece of code, or is this
merely a premature optimization?

> +       const char *x;
> +       x = getenv("GIT_TEST_UF_DELAY_WARNING");
> +       if (x) {
> +               untracked_in_ms += UF_DELAY_WARNING_IN_MS + 1;
> +       }

Style is to avoid { } braces for these one-liner bodies.

I think we don't even need the variable "x" in this case, so:

    if (getenv("GIT_TEST_UF_DELAY_WARNING"))
        untracked_in_ms += UF_DELAY_WARNING_IN_MS + 1;

would be cleaner.

> +       return UF_DELAY_WARNING_IN_MS < untracked_in_ms;
> +}



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux