On Fri, Feb 25, 2022 at 8:44 PM Junio C Hamano <gitster@xxxxxxxxx> wrote: > > "Tao Klerks via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes: > > > If the previously stored flags no longer match the current > > configuration, but the currently-applicable flags do match the current > > configuration, then the previously stored untracked cache data is > > discarded. > > > > For most users there will be no change in behavior. Users who need > > '--untracked-files=all' to perform well will have the option of > > setting "status.showuntrackedfiles" to "all". > > > > Users who need '--untracked-files=all' to perform well for their > > tooling AND prefer to avoid the verbosity of "all" when running > > git status explicitly without options... are out of luck for now (no > > change). > > So, in short, the root of the problem is that untracked cache can be > used to serve only one mode (between normal and all) of operation, > and the real solution to that problem must come in a different form, > i.e. allowing a single unified untracked cache to be usable in both > modes, perhaps by maintaining all the untracked ones in the cache, > but filter out upon output when the 'normal' mode is asked for? I wouldn't call this the root of the problem I was trying to solve with this patch, but rather the root of the remaining problem, yes. The challenge that I can't get my head around for this longer-term approach or objective, is *when* the untracked-folder nested files should be enumerated. Currently, untracked folders are only recursed into if -uall is specified or configured - and that's a significant characteristic: It's perfectly plausible that some users sometimes have huge deep untracked folder hierarchies that take a long time to explore, but git never needs to because they never specify -uall. If we decided to serve both modes with one single untracked cache structure, then we would either need to always fully recurse, or implement some sort of "just-in-time" filling in of the recursive bits when someone specifies -uall for the first time. Either way, I'm pretty sure it's beyond me to do that right. Hence this very-pragmatic approach that makes it *possible* to get good/normal performance with -uall. > > > Users who set "status.showuntrackedfiles" to "all" and yet most > > frequently explicitly call 'git status --untracked-files=normal' (and > > use the untracked cache) are the only ones who would be disadvantaged > > by this change. It seems unlikely there are any such users. > > Given how widely used we are these days, I am afraid that the days > we can safely say "users with such a strange use pattern would not > exist at all" is long gone. > > > +static unsigned configured_default_dir_flags(struct index_state *istate) > > +{ > > + /* This logic is coordinated with the setting of these flags in > > + * wt-status.c#wt_status_collect_untracked(), and the evaluation > > + * of the config setting in commit.c#git_status_config() > > + */ > > Good thing to note here. > > Style. > > /* > * Our multi-line comments reads more like this, with > * slash-asterisk that opens, and asterisk-slash that closes, > * sitting on their own lines. > */ Thanks, sorry, I thought I'd corrected them all but clearly missed some. > > > + char *status_untracked_files_config_value; > > + int config_outcome = repo_config_get_string(istate->repo, > > + "status.showuntrackedfiles", > > The indentation looks a bit off. > > In this project, tab width is 8. The beginning of the second > parameter to the function call on the second line should align with > the beginning of the first parameter that immediately follows the > open parenthesis '('. > Fixed, thx. > > + &status_untracked_files_config_value); > > + if (!config_outcome && !strcmp(status_untracked_files_config_value, "all")) { > > + return 0; > > + } else { > > + /* > > + * The default, if "all" is not set, is "normal" - leading us here. > > + * If the value is "none" then it really doesn't matter. > > + */ > > + return DIR_SHOW_OTHER_DIRECTORIES | DIR_HIDE_EMPTY_DIRECTORIES; > > + } > > +} > > I didn't see the need to pass istate to this function, though. > Shouldn't it take a repository instead? Makes sense, fixed, thx. > > > -static void new_untracked_cache(struct index_state *istate) > > +static void new_untracked_cache(struct index_state *istate, unsigned flags) > > { > > struct untracked_cache *uc = xcalloc(1, sizeof(*uc)); > > strbuf_init(&uc->ident, 100); > > uc->exclude_per_dir = ".gitignore"; > > - /* should be the same flags used by git-status */ > > - uc->dir_flags = DIR_SHOW_OTHER_DIRECTORIES | DIR_HIDE_EMPTY_DIRECTORIES; > > + uc->dir_flags = flags; > > So we used to hardcode these two flags to match what is done in > wt-status.c when show_untracked_files != SHOW_ALL_UNTRACKEDFILES; > we allow a different set of flags to be given by the caller. > > > @@ -2761,11 +2781,13 @@ static void new_untracked_cache(struct index_state *istate) > > void add_untracked_cache(struct index_state *istate) > > { > > if (!istate->untracked) { > > - new_untracked_cache(istate); > > + new_untracked_cache(istate, > > + configured_default_dir_flags(istate)); > > } else { > > if (!ident_in_untracked(istate->untracked)) { > > free_untracked_cache(istate->untracked); > > - new_untracked_cache(istate); > > + new_untracked_cache(istate, > > + configured_default_dir_flags(istate)); > > } > > } > > } > > OK. That's quite straight-forward to see how the tweak is made. > > > @@ -2781,10 +2803,12 @@ void remove_untracked_cache(struct index_state *istate) > > > > static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *dir, > > int base_len, > > - const struct pathspec *pathspec) > > + const struct pathspec *pathspec, > > + struct index_state *istate) > > { > > struct untracked_cache_dir *root; > > static int untracked_cache_disabled = -1; > > + unsigned configured_dir_flags; > > > > if (!dir->untracked) > > return NULL; > > @@ -2812,17 +2836,9 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d > > if (base_len || (pathspec && pathspec->nr)) > > return NULL; > > > > - /* Different set of flags may produce different results */ > > - if (dir->flags != dir->untracked->dir_flags || > > This is removed because we are making sure that dir->flags and > dir->untracked->dir_flags match? > > > - /* > > - * See treat_directory(), case index_nonexistent. Without > > - * this flag, we may need to also cache .git file content > > - * for the resolve_gitlink_ref() call, which we don't. > > - */ > > - !(dir->flags & DIR_SHOW_OTHER_DIRECTORIES) || > > This is removed because...? Because we *do* now support using untracked cache with -uall... As long as the config matches the runtime flags (new check later). > > > - /* We don't support collecting ignore files */ > > - (dir->flags & (DIR_SHOW_IGNORED | DIR_SHOW_IGNORED_TOO | > > - DIR_COLLECT_IGNORED))) > > > + /* We don't support collecting ignore files */ > > + if (dir->flags & (DIR_SHOW_IGNORED | DIR_SHOW_IGNORED_TOO | > > + DIR_COLLECT_IGNORED)) > > return NULL; > > > > /* > > @@ -2845,6 +2861,40 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d > > return NULL; > > } > > > > + /* We don't support using or preparing the untracked cache if > > + * the current effective flags don't match the configured > > + * flags. > > + */ > > Style (another one in large comment below). > Thx. > > + configured_dir_flags = configured_default_dir_flags(istate); > > + if (dir->flags != configured_dir_flags) > > + return NULL; > > Hmph. If this weren't necessary, this function does not need to > call configured_default_dir_flags(), and it can lose the > configured_dir_flags variable, too. Which means that > new_untracked_cache() function does not need to take the flags word > as a caller-supplied parameter. Instead, it can make a call to > configured_dir_flags() and assign the result to uc->dir_flags > itself, which would have been much nicer. I've tightened this up a little with an inline call to configured_default_dir_flags(), getting rid of the variable, let's see if that makes more sense / is cleaner. Sending new version with these changes. > > > + /* If the untracked structure we received does not have the same flags > > + * as configured, but the configured flags do match the effective flags, > > + * then we need to reset / create a new "untracked" structure to match > > + * the new config. > > + * Keeping the saved and used untracked cache in-line with the > > + * configuration provides an opportunity for frequent users of > > + * "git status -uall" to leverage the untracked cache by aligning their > > + * configuration (setting "status.showuntrackedfiles" to "all" or > > + * "normal" as appropriate), where previously this option was > > + * incompatible with untracked cache and *consistently* caused > > + * surprisingly bad performance (with fscache and fsmonitor enabled) on > > + * Windows. > > + * > > + * IMPROVEMENT OPPORTUNITY: If we reworked the untracked cache storage > > + * to not be as bound up with the desired output in a given run, > > + * and instead iterated through and stored enough information to > > + * correctly serve both "modes", then users could get peak performance > > + * with or without '-uall' regardless of their > > + * "status.showuntrackedfiles" config. > > + */ > > + if (dir->flags != dir->untracked->dir_flags) { > > + free_untracked_cache(istate->untracked); > > + new_untracked_cache(istate, configured_dir_flags); > > + dir->untracked = istate->untracked; > > + } > > > This compensates what we lost in the validate_untracked_cache() > above by making them match. Looking reasonable. > > > if (!dir->untracked->root) > > FLEX_ALLOC_STR(dir->untracked->root, name, ""); > > > > @@ -2916,7 +2966,7 @@ int read_directory(struct dir_struct *dir, struct index_state *istate, > > return dir->nr; > > } > > > > - untracked = validate_untracked_cache(dir, len, pathspec); > > + untracked = validate_untracked_cache(dir, len, pathspec, istate); > > if (!untracked) > > /* > > * make sure untracked cache code path is disabled, > > > > base-commit: dab1b7905d0b295f1acef9785bb2b9cbb0fdec84