On Thu, Jan 30, 2020 at 9:13 AM Elijah Newren <newren@xxxxxxxxx> wrote: > > On Thu, Jan 30, 2020 at 7:55 AM Derrick Stolee <stolee@xxxxxxxxx> wrote: [...] > > > @@ -1713,18 +1719,101 @@ static enum path_treatment treat_directory(struct dir_struct *dir, > > > > > > /* This is the "show_other_directories" case */ > > > > > > - if (!(dir->flags & DIR_HIDE_EMPTY_DIRECTORIES)) > > > + /* > > > + * We only need to recurse into untracked/ignored directories if > > > + * either of the following bits is set: > > > + * - DIR_SHOW_IGNORED_TOO (because then we need to determine if > > > + * there are ignored directories below) > > > + * - DIR_HIDE_EMPTY_DIRECTORIES (because we have to determine if > > > + * the directory is empty) > > > > Perhaps here is where you could also have a DIR_LIST_ALL_UNTRACKED > > flag to ensure the untracked cache loads all untracked paths? > > Do you mean DIR_KEEP_UNTRACKED_CONTENTS (which is documented in dir.h > as only having meaning when DIR_SHOW_IGNORED_TOO is also set, and thus > caused me to not list it separately)? > > Speaking of DIR_KEEP_UNTRACKED_CONTENTS, though, its handling as a > post-processing step in read_directory() is now inconsistent with how > we handle squashing a directory full of ignores into just marking the > containing directory as ignored. I think I should move the > read_directory() logic for DIR_KEEP_UNTRACKED_CONTENTS to > treat_directory() and use another counter similar to old_ignored_nr. > It should be more efficient that way, too. Oh, actually, I think I understand what you're getting at so let me clear it up. With DIR_SHOW_IGNORED_TOO, we always recurse to the bottom, because it's needed to find any files that might be ignored. (Maybe we could do something clever with checking .gitignore entries and seeing if it's impossible for them to match anything below the current directory, but the code doesn't do anything that clever.) As a side effect, we'll get all untracked files whenever that flag is set. As such, the only question is whether we want to keep all those extra untracked files that we found or not, which is the purpose of DIR_KEEP_UNTRACKED_CONTENTS. Without DIR_SHOW_IGNORED_TOO, there's no need or want to visit all untracked files without also learning of all ignored files (and, in fact, git-clean is currently the only one that wants to know about all untracked files). As far as a simple test goes, in a simple repository with a file named one/two/three/four/five/untracked-file and with nothing else under one/: Before my changes: $ strace -e trace=file git status --ignored 2>&1 | grep 'open("one/' | grep -v gitignore.*ENOENT | wc -l 62 Note that 62 == 2^5 + 2^4 + 2^3 + 2^2 + 2^1, showing how many directories we open and read. After my changes: $ strace -e trace=file git status --ignored 2>&1 | grep 'open("one/' | grep -v gitignore.*ENOENT | wc -l 5 showing that it does open and read each directory, but does so only once.