On Wed, Nov 17, 2021 at 06:06:13PM +0100, Torsten Bögershausen wrote:
> On Wed, Nov 17, 2021 at 05:12:26PM +0100, Torsten Bögershausen wrote:
> > On Wed, Nov 17, 2021 at 10:17:53AM -0500, Derrick Stolee wrote:
> > > I recently had to pave my Linux machine, so I updated it to Ubuntu
> > > 21.10 and had the choice to start using the ZFS filesystem. I thought,
> > > "Why not?" but now I maybe see why.
> > >
> > > Running the Git test suite at the v2.34.0 tag on my machine results in
> > > these failures:
> > >
> > > t0050-filesystem.sh (Wstat: 0 Tests: 11 Failed: 0)
> > >   TODO passed:   9-10
> > > t0021-conversion.sh (Wstat: 256 Tests: 41 Failed: 1)
> > >   Failed test:  31
> > >   Non-zero exit status: 1
> > > t3910-mac-os-precompose.sh (Wstat: 256 Tests: 25 Failed: 10)
> > >   Failed tests:  1, 4, 6, 8, 11-16
> > >   TODO passed:   23
> > >   Non-zero exit status: 1
> > >
> > > These are all related to the UTF8_NFD_TO_NFC prereq.
> > >
> > > Zooming in on t0050, these tests are marked as "test_expect_failure" due
> > > to an assignment of $test_unicode using the UTF8_NFD_TO_NFC prereq:
> > >
> > > $test_unicode 'rename (silent unicode normalization)' '
> > > 	git mv "$aumlcdiar" "$auml" &&
> > > 	git commit -m rename
> > > '
> > >
> > > $test_unicode 'merge (silent unicode normalization)' '
> > > 	git reset --hard initial &&
> > > 	git merge topic
> > > '
> > >
> > > The prereq creates two files using unicode characters that could
> > > collapse to equivalent meanings:
> > >
> > > test_lazy_prereq UTF8_NFD_TO_NFC '
> > > 	# check whether FS converts nfd unicode to nfc
> > > 	auml=$(printf "\303\244")
> > > 	aumlcdiar=$(printf "\141\314\210")
> > > 	>"$auml" &&
> > > 	test -f "$aumlcdiar"
> > > '
> > >
> > > What I see in that first test, the 'git mv' does change the
> > > index, but the filesystem thinks the files are the same.
> > > This may mean that our 'git add "$aumlcdiar"' from an earlier test
> > > is providing a non-equivalence in the index, and the 'git mv'
> > > changes the index without causing any issues in the filesystem.
> > >
> > > It reminds me as if we used 'git mv README readme' on a
> > > case-insensitive filesystem. Is this not a similar situation?
> > >
> > > What I'm trying to gather is that maybe this test is flawed?
> > > Or maybe something broke (or never worked?) in how we use
> > > 'git add' to not get the canonical unicode from the filesystem?
> > >
> > > The other tests all have similar interactions with 'git add'.
> > > I'm hoping that these are just test bugs, and not actually a
> > > functionality issue in Git. Yes, it is confusing that we can
> > > change the unicode of a file in the index without the filesystem
> > > understanding the difference, but that is very similar to how
> > > case-insensitive filesystems work and I don't know what else we
> > > would do here.
> > >
> > > These filesystem/unicode things are out of my expertise, so
> > > hopefully someone else has a clearer idea of what is going on.
> > > I'm happy to be a test bed, or even attempt producing patches
> > > to fix the issue once we have that clarity.
> > >
> > > Thanks,
> > > -Stolee
> >
> > Interesting.
> > The tests have always been working on HFS+, then we got
> > APFS (and needed a small fix) and now ZFS.
> >
> > I can have a look - just installing in a virtual machine.
>
> So, the virtual machine is up-and-running.
>
> I got 2 messages:
>
> ok 9 - rename (silent unicode normalization) # TODO known breakage vanished
> ok 10 - merge (silent unicode normalization) # TODO known breakage vanished
>
> Do you get the same?

Now I am even more puzzled.
Running t0050 with -x gives this:

    Author: A U Thor <author@xxxxxxxxxxx>
     1 file changed, 0 insertions(+), 0 deletions(-)
     rename "a\314\210" => "\303\244" (100%)
    ok 9 - rename (silent unicode normalization) # TODO known breakage vanished

----------------
When I create a test Git repository with one file named ä in decomposed
form and rename it to the precomposed form, Git gives me:

    tb@Ubuntu2021:~/ttt$ git mv "$aumlcdiar" "$auml"
    fatal: destination exists, source=ä, destination=ä

and in hex form:

    tb@Ubuntu2021:~/ttt$ git mv "$aumlcdiar" "$auml" 2>&1 | xxd
    00000000: 6661 7461 6c3a 2064 6573 7469 6e61 7469  fatal: destinati
    00000010: 6f6e 2065 7869 7374 732c 2073 6f75 7263  on exists, sourc
    00000020: 653d 61cc 882c 2064 6573 7469 6e61 7469  e=a.., destinati
    00000030: 6f6e 3dc3 a40a                           on=...
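
For anyone who wants to probe their own filesystem outside the test
suite, here is a minimal sketch that repeats the check from the
UTF8_NFD_TO_NFC prereq in a scratch directory. It uses the same byte
sequences as the prereq. The helper name fs_aliases_nfd_nfc and the
variables nfc_len, nfd_len, stored_hex and fs_result are made up for
this illustration and are not part of Git's test library. It also
assumes that mktemp -d is available. The bytes printed for the stored
name depend on the filesystem, so no output is shown here.

```shell
#!/bin/sh
# The two spellings of "ä" used by the UTF8_NFD_TO_NFC prereq.
auml=$(printf '\303\244')           # precomposed (NFC): U+00E4
aumlcdiar=$(printf '\141\314\210')  # decomposed (NFD): "a" + U+0308

# At the byte level the names differ: 2 bytes versus 3 bytes.
nfc_len=$(printf '%s' "$auml" | wc -c | tr -d ' ')
nfd_len=$(printf '%s' "$aumlcdiar" | wc -c | tr -d ' ')
echo "NFC bytes: $nfc_len, NFD bytes: $nfd_len"

# Hypothetical helper: create the NFC name, then look it up by the NFD name.
# Returns 0 if the filesystem treats both names as the same file.
fs_aliases_nfd_nfc () {
	dir=$(mktemp -d) || return 2
	>"$dir/$auml"
	# Record which bytes the directory listing reports for the name.
	stored_hex=$(ls "$dir" | od -An -tx1 | tr -d ' \n')
	if test -f "$dir/$aumlcdiar"
	then
		result=0
	else
		result=1
	fi
	rm -rf "$dir"
	return $result
}

if fs_aliases_nfd_nfc
then
	fs_result=aliased
else
	fs_result=distinct
fi
echo "name as listed (hex, with trailing newline): $stored_hex"
echo "filesystem treats NFD and NFC names as: $fs_result"
```

If the result is "aliased" but the listed bytes are still c3 a4, the
filesystem would seem to compare names in a normalization-insensitive
way while preserving what was written, which is closer to the
'git mv README readme' analogy than to HFS+-style conversion.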