Re: [Question] Unicode weirdness breaking tests on ZFS?

On 11/17/2021 12:39 PM, Torsten Bögershausen wrote:
> On Wed, Nov 17, 2021 at 06:06:13PM +0100, Torsten Bögershausen wrote:
>> On Wed, Nov 17, 2021 at 05:12:26PM +0100, Torsten Bögershausen wrote:
>>> On Wed, Nov 17, 2021 at 10:17:53AM -0500, Derrick Stolee wrote:
>>>> I recently had to pave my Linux machine, so I updated it to Ubuntu
>>>> 21.10 and had the choice to start using the ZFS filesystem. I thought,
>>>> "Why not?" but now I maybe see why.
>>>>
>>>> Running the Git test suite at the v2.34.0 tag on my machine results in
>>>> these failures:
>>>>
>>>> t0050-filesystem.sh                   (Wstat: 0 Tests: 11 Failed: 0)
>>>>   TODO passed:   9-10
>>>> t0021-conversion.sh                   (Wstat: 256 Tests: 41 Failed: 1)
>>>>   Failed test:  31
>>>>   Non-zero exit status: 1
>>>> t3910-mac-os-precompose.sh            (Wstat: 256 Tests: 25 Failed: 10)
>>>>   Failed tests:  1, 4, 6, 8, 11-16
>>>>   TODO passed:   23
>>>>   Non-zero exit status: 1
>>>>
>>>> These are all related to the UTF8_NFD_TO_NFC prereq.
>>>>
>>>> Zooming in on t0050, these tests are marked as "test_expect_failure" due
>>>> to an assignment of $test_unicode using the UTF8_NFD_TO_NFC prereq:
>>>>
>>>>
>>>> $test_unicode 'rename (silent unicode normalization)' '
>>>> 	git mv "$aumlcdiar" "$auml" &&
>>>> 	git commit -m rename
>>>> '
>>>>
>>>> $test_unicode 'merge (silent unicode normalization)' '
>>>> 	git reset --hard initial &&
>>>> 	git merge topic
>>>> '
>>>>
>>>>
>>>> The prereq creates two files using unicode characters that could
>>>> collapse to equivalent meanings:
>>>>
>>>>
>>>> test_lazy_prereq UTF8_NFD_TO_NFC '
>>>> 	# check whether FS converts nfd unicode to nfc
>>>> 	auml=$(printf "\303\244")
>>>> 	aumlcdiar=$(printf "\141\314\210")
>>>> 	>"$auml" &&
>>>> 	test -f "$aumlcdiar"
>>>> '
>>>>
>>>>
>>>> What I see in that first test, the 'git mv' does change the
>>>> index, but the filesystem thinks the files are the same. This
>>>> may mean that our 'git add "$aumlcdiar"' from an earlier test
>>>> is providing a non-equivalence in the index, and the 'git mv'
>>>> changes the index without causing any issues in the filesystem.
>>>>
>>>> It reminds me of using 'git mv README readme' on a
>>>> case-insensitive filesystem. Is this not a similar situation?
>>>>
>>>> What I'm trying to gather is that maybe this test is flawed?
>>>> Or maybe something broke (or never worked?) in how we use
>>>> 'git add' to not get the canonical unicode from the filesystem?
>>>>
>>>> The other tests all have similar interactions with 'git add'.
>>>> I'm hoping that these are just test bugs, and not actually a
>>>> functionality issue in Git. Yes, it is confusing that we can
>>>> change the unicode of a file in the index without the filesystem
>>>> understanding the difference, but that is very similar to how
>>>> case-insensitive filesystems work and I don't know what else we
>>>> would do here.
>>>>
>>>> These filesystem/unicode things are out of my expertise, so
>>>> hopefully someone else has a clearer idea of what is going on.
>>>> I'm happy to be a test bed, or even attempt producing patches
>>>> to fix the issue once we have that clarity.
>>>>
>>>> Thanks,
>>>> -Stolee
>>>
>>> Interesting.
>>> The tests have always been working on HFS+, then we got
>>> APFS (and needed a small fix) and now ZFS.
>>>
>>> I can have a look - just installing in a virtual machine.
>>
>> So, the virtual machine is up-and-running.
>>
>> I got 2 messages:
>>
>> ok 9 - rename (silent unicode normalization) # TODO known breakage vanished
>> ok 10 - merge (silent unicode normalization) # TODO known breakage vanished
>>
>> Do you get the same?

Halfway, I see this:

ok 9 - rename (silent unicode normalization) # TODO known breakage vanished
not ok 10 - merge (silent unicode normalization) # TODO known breakage
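
For anyone who wants to compare machines outside the test suite, here is a
minimal sketch of the same probe the UTF8_NFD_TO_NFC prereq performs, run in
a scratch directory. The function name probe_nfd_nfc and the scratch
directory naming are my own inventions, not anything from t/, so treat this
as an illustration only.

```shell
# Hypothetical helper, not part of the Git test suite.
# Usage: probe_nfd_nfc <directory on the filesystem to probe>
probe_nfd_nfc () {
	base=${1:-.}
	# Scratch directory inside the filesystem under test.
	dir=$(mktemp -d "$base/nfd-probe.XXXXXX") || return 2
	auml=$(printf '\303\244')          # U+00E4, precomposed (NFC)
	aumlcdiar=$(printf '\141\314\210') # "a" + U+0308, decomposed (NFD)
	# Create the file under the NFC name only.
	: >"$dir/$auml" || { rm -rf "$dir"; return 2; }
	# Same check as the prereq: is it reachable under the NFD name?
	if test -f "$dir/$aumlcdiar"
	then
		result=aliased
	else
		result=distinct
	fi
	rm -rf "$dir"
	echo "$result"
}
```

Calling it as `probe_nfd_nfc .` from inside the worktree prints "aliased" when
the prereq would be satisfied and "distinct" otherwise. Note that "aliased"
only says the lookup succeeded; it does not say whether the filesystem
rewrote the name or merely matches both forms.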

> Now I am even more puzzled.
> running t0050 with -x gives this:
> 
>  Author: A U Thor <author@xxxxxxxxxxx>
>   1 file changed, 0 insertions(+), 0 deletions(-)
>    rename "a\314\210" => "\303\244" (100%)
>    ok 9 - rename (silent unicode normalization) # TODO known breakage vanished
> 
> 
> ----------------
> When I create a test Git repository with one file named with a
> decomposed ä and rename it to the precomposed ä, Git gives me:
> 
> tb@Ubuntu2021:~/ttt$ git mv "$aumlcdiar" "$auml"
> fatal: destination exists, source=ä, destination=ä
> 
> and in hex form:
> 
> tb@Ubuntu2021:~/ttt$ git mv "$aumlcdiar" "$auml" 2>&1 | xxd
> 00000000: 6661 7461 6c3a 2064 6573 7469 6e61 7469  fatal: destinati
> 00000010: 6f6e 2065 7869 7374 732c 2073 6f75 7263  on exists, sourc
> 00000020: 653d 61cc 882c 2064 6573 7469 6e61 7469  e=a.., destinati
> 00000030: 6f6e 3dc3 a40a                           on=...
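
As an aid for reading that dump, a small sketch that prints the bytes of the
two spellings, using the same printf escapes as the prereq. The helper name
hex_of is made up for this mail.

```shell
auml=$(printf '\303\244')          # precomposed ä (NFC)
aumlcdiar=$(printf '\141\314\210') # "a" + combining diaeresis (NFD)

# Hypothetical helper: print the bytes of its argument as one hex string.
hex_of () {
	printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
	echo
}

hex_of "$auml"      # prints c3a4
hex_of "$aumlcdiar" # prints 61cc88
```

So in the message above the source is the decomposed form (61 cc 88) and the
destination is the precomposed one (c3 a4).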
 
Interesting: does this "fatal" error not change the exit code? Oddly,
I don't get that failure under -x:

checking known breakage of 0050.9 'rename (silent unicode normalization)': 
        git mv "$aumlcdiar" "$auml" &&
        git commit -m rename

+ git mv ä ä
+ git commit -m rename
[main 591d19c] rename
 Author: A U Thor <author@xxxxxxxxxxx>
 1 file changed, 0 insertions(+), 0 deletions(-)
 rename "a\314\210" => "\303\244" (100%)
ok 9 - rename (silent unicode normalization) # TODO known breakage vanished

checking known breakage of 0050.10 'merge (silent unicode normalization)': 
        git reset --hard initial &&
        git merge topic

+ git reset --hard initial
error: unable to unlink old 'ä': No such file or directory
fatal: Could not reset index file to revision 'initial'.
error: last command exited with $?=128
not ok 10 - merge (silent unicode normalization) # TODO known breakage


But notice that -x does make test 10 go back to failing.
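
On my exit-code question: with 'git mv ... 2>&1 | xxd' the shell reports the
status of xxd, the last command in the pipeline (unless something like
pipefail is set), so that transcript cannot show what 'git mv' returned. A
rough sketch of a wrapper that keeps both pieces of information follows;
run_and_dump is a hypothetical name, and od stands in for xxd only because
it is more widely installed.

```shell
# Hypothetical helper: run a command, dump its combined stdout/stderr
# byte by byte, and report the command's own exit status.
run_and_dump () {
	status=0
	# Capture the output first so $? belongs to the command itself,
	# not to the dumping tool.
	out=$("$@" 2>&1) || status=$?
	printf '%s\n' "$out" | od -c
	echo "exit status: $status"
	return "$status"
}
```

Something like `run_and_dump git mv "$aumlcdiar" "$auml"` would then print
the dump followed by the status. A fatal error from git should show up as
128, as it does for the reset in test 10.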

Thanks,
-Stolee


