On Wed, Nov 17 2021, Derrick Stolee wrote:

> I recently had to pave my Linux machine, so I updated it to Ubuntu
> 21.10 and had the choice to start using the ZFS filesystem. I thought,
> "Why not?" but now I maybe see why.
>
> Running the Git test suite at the v2.34.0 tag on my machine results in
> these failures:
>
> t0050-filesystem.sh          (Wstat: 0 Tests: 11 Failed: 0)
>   TODO passed:   9-10
> t0021-conversion.sh          (Wstat: 256 Tests: 41 Failed: 1)
>   Failed test:  31
>   Non-zero exit status: 1
> t3910-mac-os-precompose.sh   (Wstat: 256 Tests: 25 Failed: 10)
>   Failed tests:  1, 4, 6, 8, 11-16
>   TODO passed:   23
>   Non-zero exit status: 1
>
> These are all related to the UTF8_NFD_TO_NFC prereq.
>
> Zooming in on t0050, these tests are marked as "test_expect_failure" due
> to an assignment of $test_unicode using the UTF8_NFD_TO_NFC prereq:
>
>
> $test_unicode 'rename (silent unicode normalization)' '
>     git mv "$aumlcdiar" "$auml" &&
>     git commit -m rename
> '
>
> $test_unicode 'merge (silent unicode normalization)' '
>     git reset --hard initial &&
>     git merge topic
> '
>
>
> The prereq creates two files using unicode characters that could
> collapse to equivalent meanings:
>
>
> test_lazy_prereq UTF8_NFD_TO_NFC '
>     # check whether FS converts nfd unicode to nfc
>     auml=$(printf "\303\244")
>     aumlcdiar=$(printf "\141\314\210")
>     >"$auml" &&
>     test -f "$aumlcdiar"
> '
>
>
> What I see in that first test, the 'git mv' does change the
> index, but the filesystem thinks the files are the same. This
> may mean that our 'git add "$aumlcdiar"' from an earlier test
> is providing a non-equivalence in the index, and the 'git mv'
> changes the index without causing any issues in the filesystem.
>
> It reminds me as if we used 'git mv README readme' on a case-
> insensitive filesystem. Is this not a similar situation?
>
> What I'm trying to gather is that maybe this test is flawed?
> Or maybe something broke (or never worked?) in how we use
> 'git add' to not get the canonical unicode from the filesystem?
>
> The other tests all have similar interactions with 'git add'.
> I'm hoping that these are just test bugs, and not actually a
> functionality issue in Git. Yes, it is confusing that we can
> change the unicode of a file in the index without the filesystem
> understanding the difference, but that is very similar to how
> case-insensitive filesystems work and I don't know what else we
> would do here.
>
> These filesystem/unicode things are out of my expertise, so
> hopefully someone else has a clearer idea of what is going on.
> I'm happy to be a test bed, or even attempt producing patches
> to fix the issue once we have that clarity.

I haven't used ZFS, but this points to non-POSIX behavior on the FS
itself. It looks like tweaking the "normalization" property might change
it, see:
https://manpages.ubuntu.com/manpages/eoan/man8/zfs.8.html

There's also "casesensitivity" and "utf8only".

We probably don't want to invoke some ZFS command on every test to
interrogate this, but if we can pass it down from GIT-BUILD-OPTIONS or
similar then we could have a test prereq check this.

Or perhaps it's as simple as changing the "UTF8_NFD_TO_NFC" prereq from
doing a "test -f" to e.g. "echo *" and seeing what it gets back. Perhaps
ZFS says "yes" to "it exists?" but when doing a readdir() it will
canonicalize?
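
To make the "echo *" idea concrete, here is a rough standalone sketch of
such a probe. It is not a patch against t/test-lib.sh, and the function
name probe_unicode_normalization is made up for illustration. It creates
the precomposed name the way the existing prereq does, then reports
separately whether a lookup by the decomposed name succeeds and which
byte sequence the directory listing hands back. It assumes "mktemp -d"
is available.

```shell
# Hypothetical helper (not part of Git's test suite): probe how the
# filesystem under $TMPDIR treats NFC vs. NFD spellings of one name.
probe_unicode_normalization () (
	auml=$(printf '\303\244')          # U+00E4, precomposed (NFC)
	aumlcdiar=$(printf '\141\314\210') # "a" + U+0308, decomposed (NFD)

	dir=$(mktemp -d) || exit 1
	cd "$dir" || exit 1

	# Same setup as the existing prereq: create the NFC spelling.
	>"$auml" || exit 1

	# What the prereq checks today: does a lookup by the NFD
	# spelling find the file we just created?
	if test -f "$aumlcdiar"
	then
		lookup=aliased
	else
		lookup=distinct
	fi

	# The proposed alternative: which spelling does readdir()
	# (via glob expansion) actually return?
	listed=$(echo *)
	case "$listed" in
	"$auml")      readdir=nfc ;;
	"$aumlcdiar") readdir=nfd ;;
	*)            readdir=other ;;
	esac

	# Clean up the scratch directory before reporting.
	cd / && rm -rf "$dir"
	echo "lookup=$lookup readdir=$readdir"
)

probe_unicode_normalization
```

On a filesystem that stores names as opaque bytes I'd expect
"lookup=distinct readdir=nfc". If ZFS with normalization enabled prints
"lookup=aliased readdir=nfc", that would suggest it matches equivalent
names on lookup but keeps the bytes it was given, and the prereq could
key off the readdir result instead of "test -f".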