On Tue, Dec 26, 2017 at 1:45 AM, Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> wrote: > > On Mon, Dec 25 2017, Duy Nguyen jotted: > >> On Fri, Dec 22, 2017 at 9:00 PM, Ævar Arnfjörð Bjarmason >> <avarab@xxxxxxxxx> wrote: >>> The untracked cache gets confused when a directory is swapped out for >>> a symlink to another directory. Whatever files are inside the target >>> of the symlink will be incorrectly shown as untracked. This issue does >>> not happen if the symlink links to another file, only if it links to >>> another directory. >> >> Sounds about right (I completely forgot about dir symlinks). Since >> I've been away for some time and have not caught up (probably cannot) >> with the mailing list yet, is anyone working on this? It may be >> easiest to just detect symlinksand disable the cache for now. > > I haven't yet, I wanted to see what you had to say about it, > i.e. whether it was a "do'h here's a fix" or if it was more involved > than that. OK I have more time to actually read your test and play with it (last time I stopped at the word "symlink"). First thing out of the way first, I think the stat() call in valid_cached_dir() is wrong. We don't want to automatically follow a symlink, we want stat of the symlink itself. OK back to the test. If you insert test-dump-untracked-cached around the "git checkout master" line, then we get the data that the next/faulty git status uses. For me it shows this ... / 0000000000000000000000000000000000000000 recurse /one/ 0000000000000000000000000000000000000000 recurse valid /two/ 0000000000000000000000000000000000000000 recurse valid OK so we have two uptodate cached dirs for "one" and "two". Strangely, root dir is not cached (no valid flag). I don't know why but that's ok we'll roll with it. It means we will ignore "/" content and do the opendir() and readdir() dance instead. This is where it gets fun, when readdir() returns "one", we hit this code in treat_one_path() and correctly ignores it /* Always exclude indexed files */ if (dtype != DT_DIR && has_path_in_index) return path_none; and we do _nothing_ towards the cached version of "one". In constrast, when readdir() returns "two" we are able to get further and invalidate it, deleting the cached data. After the readdir() dance is complete, dir.c is confident that it has all the good data in the world in its cache and notes down "from now on uses the cached data for /". Another test-dump-... after the second-to-last git-status can show this "valid" flag. Unfortunately the "one" dir is NOT invalidated. So the last git-status sees that cached "/" is good, it skips opendir() and goes with the cached directories which are "two" and the bad "one". In short, we fail to invalidate bad data in some case. An invalidate_directory() call before the "return path_none" above might be the solution... -- Duy