On Thu, May 6, 2021 at 9:27 PM Eric Sunshine <sunshine@xxxxxxxxxxxxxx> wrote:
>
> On Fri, May 7, 2021 at 12:05 AM Elijah Newren via GitGitGadget
> <gitgitgadget@xxxxxxxxx> wrote:
> > PNPM is apparently creating deeply nested (but ignored) directory
> > structures; traversing them is costly performance-wise, unnecessary, and
> > in some cases is even throwing warnings/errors because the paths are too
> > long to handle on various platforms. Add a testcase that demonstrates
> > this problem.
> >
> > Signed-off-by: Elijah Newren <newren@xxxxxxxxx>
> > ---
> > diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
> > @@ -746,4 +746,44 @@ test_expect_success 'clean untracked paths by pathspec' '
> > +test_expect_failure 'avoid traversing into ignored directories' '
> > +	test_when_finished rm -f output error &&
> > +	test_create_repo avoid-traversing-deep-hierarchy &&
> > +	(
> > +		cd avoid-traversing-deep-hierarchy &&
> > +
> > +		>directory-random-file.txt &&
> > +		# Put this file under directory400/directory399/.../directory1/
> > +		depth=400 &&
> > +		for x in $(test_seq 1 $depth); do
> > +			mkdir "tmpdirectory$x" &&
> > +			mv directory* "tmpdirectory$x" &&
> > +			mv "tmpdirectory$x" "directory$x"
> > +		done &&
>
> Is this expensive/slow loop needed because you'd otherwise run afoul
> of command-line length limits on some platforms if you tried creating
> the entire mess of directories with a single `mkdir -p`?

The whole point is creating a path long enough that it runs afoul of
limits, yes. If we had an alternative way to check whether dir.c
actually recursed into a directory, then I could dispense with this
and just have a single directory (and it could be named a single
character long for that matter too), but I don't know of a good way
to do that.

(Some possibilities I considered along that route are mentioned at
https://lore.kernel.org/git/CABPp-BF3e+MWQAGb6ER7d5jqjcV=kYqQ2stM_oDyaqvonPPPSw@xxxxxxxxxxxxxx/)

> > +		git clean -ffdxn -e directory$depth >../output 2>../error &&
> > +
> > +		test_must_be_empty ../output &&
> > +		# We especially do not want things like
> > +		# "warning: could not open directory "
> > +		# appearing in the error output. It is true that directories
> > +		# that are too long cannot be opened, but we should not be
> > +		# recursing into those directories anyway since the very first
> > +		# level is ignored.
> > +		test_must_be_empty ../error &&
> > +
> > +		# alpine-linux-musl fails to "rm -rf" a directory with such
> > +		# a deeply nested hierarchy. Help it out by deleting the
> > +		# leading directories ourselves. Super slow, but, what else
> > +		# can we do? Without this, we will hit a
> > +		#     error: Tests passed but test cleanup failed; aborting
> > +		# so do this ugly manual cleanup...
> > +		while test ! -f directory-random-file.txt; do
> > +			name=$(ls -d directory*) &&
> > +			mv $name/* . &&
> > +			rmdir $name
> > +		done
>
> Shouldn't this cleanup loop be under the control of
> test_when_finished() to ensure it is invoked regardless of how the
> test exits?

I thought about that, but if the test fails, it seems nicer to leave
everything behind so it can be inspected. It's similar to test_done,
which will only delete the $TRASH_DIRECTORY if all the tests passed.

So no, I don't think this should be under the control of
test_when_finished.
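For anyone following along, here is a hypothetical standalone sketch of
the "wrap" loop from the test, reduced to a depth of 5 for illustration
(the names "demo" and the depth are mine, not from the patch). The key
property is that every mkdir/mv invocation only ever handles
single-component names, so no individual command sees the full path,
which at depth 400 would exceed PATH_MAX on many platforms:

```shell
#!/bin/sh
# Sketch only: build directory5/directory4/.../directory1/ by
# repeatedly wrapping the existing tree in a new parent directory.
# Each filesystem operation touches exactly one path component.
depth=5
mkdir demo && cd demo || exit 1
: >directory-random-file.txt
for x in $(seq 1 $depth); do
	mkdir "tmpdirectory$x" &&
	mv directory* "tmpdirectory$x" &&
	mv "tmpdirectory$x" "directory$x" || exit 1
done
# The file now sits at directory5/directory4/.../directory1/
find . -name directory-random-file.txt
cd ..
```

At depth 400 the same loop produces a path long enough that opening the
leaf directories fails, which is exactly the condition the test wants
git clean to avoid triggering.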
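The manual cleanup can be sketched the same way; this self-contained
example (directory names and depth 5 are mine, for illustration) first
builds a small hierarchy, then dismantles it as the test does: hoist
the contents of the lone top-level directoryN up one level and rmdir
the emptied wrapper, so no single command has to walk the full depth:

```shell
#!/bin/sh
# Sketch only: build a small nested hierarchy, then unwrap it one
# level at a time instead of relying on "rm -rf" to recurse.
mkdir demo-cleanup && cd demo-cleanup || exit 1
: >directory-random-file.txt
for x in 1 2 3 4 5; do
	mkdir "tmpdirectory$x" &&
	mv directory* "tmpdirectory$x" &&
	mv "tmpdirectory$x" "directory$x" || exit 1
done
# Unwrap: when directory-random-file.txt reappears at the top level,
# every wrapper directory has been removed.
while test ! -f directory-random-file.txt; do
	name=$(ls -d directory*) &&
	mv "$name"/* . &&
	rmdir "$name" || exit 1
done
ls	# only directory-random-file.txt remains
cd ..
```

As with the build loop, each iteration is one rename plus one rmdir of
a single component, which is what lets fragile rm implementations (the
alpine-linux-musl case above) off the hook.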