Hi Kyle, Thanks for the clear report and pointing out relevant commit(s) and discussion. Apologies in advance if I come off a bit ranty; it's directed to dir.c and its API and not you. It seems to be quite the mess... On Tue, Dec 3, 2019 at 2:08 PM Kyle Meyer <kyle@xxxxxxxxxx> wrote: > > 89a1f4aaf7 (dir: if our pathspec might match files under a dir, recurse > into it, 2019-09-17) introduced a change in behavior in terms of > traversing untracked nested repositories. Say we have a repository that > contains a single untracked repository with untracked content: > > $ git init && git init a && touch a/x > > Calling ls-files with the nested repository as the sole pathspec does > not recurse into that repository: > > $ git ls-files --other a > a/ > > However, as of 89a1f4aaf7, adding an additional pathspec results in the > nested repository being traversed: > > $ git ls-files --other a foo > a/ > a/x > > Reading 89a1f4aaf7 and skimming the patch series and related thread [*], > I haven't found anything that makes me think this change in behavior was > intentional. Oh man, I guess I shouldn't be surprised by another area of the code that depends on dir.c but lacks any meaningful tests of its behavior, violated the existing contract[1], and depends on the side effects of other bugs -- bugs which don't cover all cases and thus causes it to get different behavior depending on things that otherwise shouldn't matter. Behold, *before* my changes to dir.c: $ git --version 2.23.0 $ find . -type f | grep -v .git ./empty ./untracked_repo/empty ./untracked_dir/empty ./world $ git ls-files world $ git ls-files -o empty untracked_dir/empty untracked_repo/ So, as you say, it wouldn't traverse into untracked_repo/. Now we start adding pathspecs: $ git ls-files -o untracked_repo untracked_repo/ As you mentioned, it won't traverse into it even when specified... $ git ls-files -o untracked_repo/ untracked_repo/empty ...except that it does traverse into this directory if the user tab completes the name or otherwise manually adds a trailing slash. Weird, let's try multiple pathspecs: $ git ls-files -o untracked_dir untracked_repo untracked_dir/empty untracked_repo/ $ git ls-files -o untracked_dir untracked_repo/ untracked_dir/empty untracked_repo/ So it will traverse into the untracked_repo when specified as 'untracked_repo/' but not if there are more than one pathspec given?!? And it traverses into an untracked directory regardless of the trailing slash? <sarcasm>What a paragon of consistency...</sarcasm> At least my changes in git-2.24.0 made the behavior consistent; it'll always traverse into a directory that matches a given pathspec. As for whether that's desirable or not when the pathspec is a submodule, I'm not certain. My fixes to dir.c stalled out for over a year and a half despite a few reports that they had fixed issues people were continuing to report in the wild because the whole traversal logic was such a mess and there were so many dependencies on existing behavior built up with that mess that it was really hard to determine what "correct" behavior was and others seemed to be unwilling to wade through the muck to figure out where sanity might lie so they could help give me pointers or opinions on what "correct" was[2]. But here are some possibilities that at least sound sane: A) ls-files -o should traverse into untracked submodules. This case is easy; the code already does that. B) ls-files -o should NOT traverse into untracked submodules AND should not even report them. If so, fix looks like this: diff --git a/builtin/ls-files.c b/builtin/ls-files.c index f069a028ce..f144d44d8b 100644 --- a/builtin/ls-files.c +++ b/builtin/ls-files.c @@ -301,6 +301,9 @@ static void show_files(struct repository *repo, struct dir_struct *dir) int i; struct strbuf fullname = STRBUF_INIT; + if (!recurse_submodules) + dir->flags |= DIR_SKIP_NESTED_GIT; + /* For cached/deleted files we don't need to even do the readdir */ if (show_others || show_killed) { if (!show_others) C) ls-files -o should NOT traverse into untracked submodules, but should at least report their directory name. If so, the fix is probably to move the DIR_NO_GITLINKS if-block within dir.c:treat_directory() to put it right next to the DIR_SKIP_NESTED_GIT if-block (and maybe even partially combine the two) so that it comes before some of the other code in that function. Might have to be careful about checking for the presence of trailing slashes. It seems like it should be clear which one of these is "correct", but I seem to be short a few brain cycles right now...either that, or maybe my very rare use of submodules and nested repositories means I just don't have enough context to answer. Maybe it's obvious to someone else... Elijah [1] https://lore.kernel.org/git/xmqqefjp6sko.fsf@xxxxxxxxxxxxxxxxxxxxxxxxx/ [2] https://lore.kernel.org/git/20190905154735.29784-1-newren@xxxxxxxxx/