Re: [PATCH v5 2/2] untracked-cache: support '--untracked-files=all' if configured

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



"Tao Klerks via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:

> From: Tao Klerks <tao@xxxxxxxxxx>
>
> Untracked cache was originally designed to only work with
> '--untracked-files=normal', but this causes performance issues for UI
> tooling that wants to see "all" on a frequent basis. On the other hand,
> the conditions that prevented applicability to the "all" mode no
> longer seem to apply.

There is a slight leap in logic flow before ", but this causes" that
forces readers read the above twice and guess what is missing.  I am
guessing that

    ... was designed to only work with "--untracked-files=normal",
    and is bypassed when "--untracked-files=all" request, but this
    causes ...

is what you meant to say.

The claim in the last sentence "... no longer seem to apply" is
thrown at the readers without much rationale.  Either justify it
more solidly or discard the claim?

> The disqualification of untracked cache has a particularly significant
> impact on Windows with the defaulted fscache, where the introduction
> of fsmonitor can make the first and costly directory-iteration happen
> in untracked file detection, single-threaded, rather than in
> preload-index on multiple threads. Improving the performance of a
> "normal" 'git status' run with fsmonitor can make
> 'git status --untracked-files=all' perform much worse.

The last sentence is a bit hard to parse and very hard to reason
about.  Do you mean to say "--untracked-files=all is slower when the
untracked cache is in use, so the performance of 'git status' may be
improved for '--untracked-files=normal' when the untracked cache is
used, it hurts when 'git status --untracked-files=all' is run"?

> To partially address this, align the supported directory flags for the
> stored untracked cache data with the git config. If a user specifies
> an '--untracked-files=' commandline parameter that does not align with
> their 'status.showuntrackedfiles' config value, then the untracked
> cache will be ignored - as it is for other unsupported situations like
> when a pathspec is specified.
>
> If the previously stored flags no longer match the current
> configuration, but the currently-applicable flags do match the current
> configuration, then discard the previously stored untracked cache
> data.

Let me follow what actually happens in the patch aloud.

> -static void new_untracked_cache(struct index_state *istate)
> +static unsigned new_untracked_cache_flags(struct index_state *istate)
> +{
> +	struct repository *repo = istate->repo;
> +	char *val;
> +
> +	/*
> +	 * This logic is coordinated with the setting of these flags in
> +	 * wt-status.c#wt_status_collect_untracked(), and the evaluation
> +	 * of the config setting in commit.c#git_status_config()
> +	 */
> +	if (!repo_config_get_string(repo, "status.showuntrackedfiles", &val) &&
> +	    !strcmp(val, "all"))
> +		return 0;
> +
> +	/*
> +	 * The default, if "all" is not set, is "normal" - leading us here.
> +	 * If the value is "none" then it really doesn't matter.
> +	 */
> +	return DIR_SHOW_OTHER_DIRECTORIES | DIR_HIDE_EMPTY_DIRECTORIES;
> +}

This _guesses_ the user preference from the configuration.

> +static void new_untracked_cache(struct index_state *istate, int flags)
>  {
>  	struct untracked_cache *uc = xcalloc(1, sizeof(*uc));
>  	strbuf_init(&uc->ident, 100);
>  	uc->exclude_per_dir = ".gitignore";
> -	/* should be the same flags used by git-status */
> -	uc->dir_flags = DIR_SHOW_OTHER_DIRECTORIES | DIR_HIDE_EMPTY_DIRECTORIES;
> +	uc->dir_flags = flags >= 0 ? flags : new_untracked_cache_flags(istate);

We use unsigned for the flag word unless there is a reason not to,
but this function wants to use top-bit as a signal to "guess from
config".  The caller either dictates what bits to set, or we guess.

And the created uc records the flags.

> @@ -2762,11 +2782,11 @@ static void new_untracked_cache(struct index_state *istate)
>  void add_untracked_cache(struct index_state *istate)
>  {
>  	if (!istate->untracked) {
> -		new_untracked_cache(istate);
> +		new_untracked_cache(istate, -1);

We are newly creating, so "-1" (guess from config) may be the most
appropriate (unless the caller of add_untracked_cache() already
knows what operation it is using for, that is).

>  	} else {
>  		if (!ident_in_untracked(istate->untracked)) {

Found an existing one but need to recreate.

>  			free_untracked_cache(istate->untracked);
> -			new_untracked_cache(istate);
> +			new_untracked_cache(istate, -1);

In this case, is it likely to give us a better guess to read the
configuration, or copy from the existing untracked file?  My gut
feeling says it would be the latter, and if that is correct, then
we might want

+	      		int old_flags = istate->untracked->dir_flags;
 			free_untracked_cache(istate->untracked);
-			new_untracked_cache(istate);
+			new_untracked_cache(istate, old_flags);

instead?  I dunno.

> @@ -2781,9 +2801,9 @@ void remove_untracked_cache(struct index_state *istate)
>  }
>  
>  static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *dir,
> -						      int base_len,
> -						      const struct pathspec *pathspec,
> -						      struct index_state *istate)
> +							    int base_len,
> +							    const struct pathspec *pathspec,
> +							    struct index_state *istate)
>  {
>  	struct untracked_cache_dir *root;
>  	static int untracked_cache_disabled = -1;

Is this just re-indenting?  Not very welcome, but OK.

> @@ -2814,17 +2834,9 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d
>  	if (base_len || (pathspec && pathspec->nr))
>  		return NULL;
>  
> -	/* Different set of flags may produce different results */
> -	if (dir->flags != dir->untracked->dir_flags ||
> -	    /*
> -	     * See treat_directory(), case index_nonexistent. Without
> -	     * this flag, we may need to also cache .git file content
> -	     * for the resolve_gitlink_ref() call, which we don't.
> -	     */
> -	    !(dir->flags & DIR_SHOW_OTHER_DIRECTORIES) ||
> -	    /* We don't support collecting ignore files */
> -	    (dir->flags & (DIR_SHOW_IGNORED | DIR_SHOW_IGNORED_TOO |
> -			   DIR_COLLECT_IGNORED)))
> +	/* We don't support collecting ignore files */
> +	if (dir->flags & (DIR_SHOW_IGNORED | DIR_SHOW_IGNORED_TOO |
> +			DIR_COLLECT_IGNORED))
>  		return NULL;
>  
>  	/*
> @@ -2847,6 +2859,41 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d
>  		return NULL;
>  	}
>  
> +	/*
> +	 * We don't support using or preparing the untracked cache if
> +	 * the current effective flags don't match the configured
> +	 * flags.
> +	 */

Is that what we want?  What happens when your user does this:

    - set showuntrackedfiles to "normal"
    - generate the untracked cache
    - set showuntrackedfiles to "all"

And now the user wants to make a request that wants to see flags
that are suitable for "normal".

The best case would be for the untracked cache to know what flags
were in use when it was generated, so that in the above sequence,
even though the current value of configuration variable and the
current request from the command line contradict each other, we
can notice that the untracked cache data is suitable for the current
request and the right thing happens.

> +	if (dir->flags != new_untracked_cache_flags(istate))
> +		return NULL;

But this only pays attention to what is in the configuration?  It
seems to me that it is being too pessimistic, but perhaps it is
necessary for correctness somehow?

Thanks.




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux