Re: [Question] Unicode weirdness breaking tests on ZFS?

On Wed, Nov 17 2021, Derrick Stolee wrote:

> I recently had to pave my Linux machine, so I updated it to Ubuntu
> 21.10 and had the choice to start using the ZFS filesystem. I thought,
> "Why not?" but now I maybe see why.
>
> Running the Git test suite at the v2.34.0 tag on my machine results in
> these failures:
>
> t0050-filesystem.sh                   (Wstat: 0 Tests: 11 Failed: 0)
>   TODO passed:   9-10
> t0021-conversion.sh                   (Wstat: 256 Tests: 41 Failed: 1)
>   Failed test:  31
>   Non-zero exit status: 1
> t3910-mac-os-precompose.sh            (Wstat: 256 Tests: 25 Failed: 10)
>   Failed tests:  1, 4, 6, 8, 11-16
>   TODO passed:   23
>   Non-zero exit status: 1
>
> These are all related to the UTF8_NFD_TO_NFC prereq.
>
> Zooming in on t0050, these tests are marked as "test_expect_failure" due
> to an assignment of $test_unicode using the UTF8_NFD_TO_NFC prereq:
>
>
> $test_unicode 'rename (silent unicode normalization)' '
> 	git mv "$aumlcdiar" "$auml" &&
> 	git commit -m rename
> '
>
> $test_unicode 'merge (silent unicode normalization)' '
> 	git reset --hard initial &&
> 	git merge topic
> '
>
>
> The prereq creates a file under the precomposed (NFC) name and then
> checks whether the decomposed (NFD) name resolves to the same file:
>
>
> test_lazy_prereq UTF8_NFD_TO_NFC '
> 	# check whether FS converts nfd unicode to nfc
> 	auml=$(printf "\303\244")
> 	aumlcdiar=$(printf "\141\314\210")
> 	>"$auml" &&
> 	test -f "$aumlcdiar"
> '
>
>
> What I see in that first test is that the 'git mv' does change the
> index, but the filesystem thinks the files are the same. This
> may mean that our 'git add "$aumlcdiar"' from an earlier test
> is providing a non-equivalence in the index, and the 'git mv'
> changes the index without causing any issues in the filesystem.
>
> It reminds me of using 'git mv README readme' on a
> case-insensitive filesystem. Is this not a similar situation?
>
> What I'm trying to gather is that maybe this test is flawed?
> Or maybe something broke (or never worked?) in how we use
> 'git add' to not get the canonical unicode from the filesystem?
>
> The other tests all have similar interactions with 'git add'.
> I'm hoping that these are just test bugs, and not actually a
> functionality issue in Git. Yes, it is confusing that we can
> change the unicode of a file in the index without the filesystem
> understanding the difference, but that is very similar to how
> case-insensitive filesystems work and I don't know what else we
> would do here.
>
> These filesystem/unicode things are out of my expertise, so
> hopefully someone else has a clearer idea of what is going on.
> I'm happy to be a test bed, or even attempt producing patches
> to fix the issue once we have that clarity.

I haven't used ZFS, but this points to non-POSIX behavior on the FS
itself. It looks like tweaking the "normalization" property might change
it, see: https://manpages.ubuntu.com/manpages/eoan/man8/zfs.8.html

There's also "casesensitivity" and "utf8only".

We probably don't want to invoke some ZFS command on every test to
interrogate this, but if we can pass it down from GIT-BUILD-OPTIONS or
similar then we could have a test prereq check this.
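
For reference, reading that property by hand would look roughly like the
sketch below. It assumes the dataset name is already known; the helper
name "zfs_normalization" and the dataset "rpool/example" are made up for
illustration, and nothing here is wired into GIT-BUILD-OPTIONS. As far as
I can tell from the man page, the property takes one of none, formC,
formD, formKC or formKD.

```shell
#!/bin/sh
# Hypothetical helper: print the "normalization" property of a ZFS
# dataset, or "unknown" when zfs(8) is not installed or the dataset
# cannot be queried.
zfs_normalization () {
	dataset=$1
	if command -v zfs >/dev/null 2>&1 &&
	   value=$(zfs get -H -o value normalization "$dataset" 2>/dev/null) &&
	   test -n "$value"
	then
		echo "$value"
	else
		echo unknown
	fi
}

# "rpool/example" is a placeholder dataset name, not a real one.
zfs_normalization rpool/example
```

Anything other than "none" would suggest that lookups on that dataset
ignore the difference between NFC and NFD names.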

Or perhaps it's as simple as changing the "UTF8_NFD_TO_NFC" prereq from
doing a "test -f" to e.g. "echo *" and seeing what it gets back. Perhaps
ZFS says "yes" to "it exists?" but when doing a readdir() it will
canonicalize?
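
A minimal sketch of that idea, which I have not run on ZFS: create the
file under its NFC name, then report separately what a lookup by the NFD
name says and which form the directory listing hands back. The function
name and the messages are invented for illustration; this is not the
existing prereq and not a proposed patch.

```shell
#!/bin/sh
# Sketch: compare a lookup by name with a directory listing for a file
# created under a precomposed (NFC) name.
probe_unicode_normalization () {
	auml=$(printf '\303\244')          # U+00E4, precomposed (NFC)
	aumlcdiar=$(printf '\141\314\210') # "a" + U+0308, decomposed (NFD)
	dir=$(mktemp -d) || return 1

	(
		cd "$dir" &&
		>"$auml" &&
		if test -f "$aumlcdiar"
		then
			echo "lookup: NFD name finds the NFC file"
		else
			echo "lookup: NFD name does not find the NFC file"
		fi &&
		# The directory holds exactly one entry, so the glob
		# expands to that entry's name as the listing returns it.
		listed=$(echo *) &&
		if test "$listed" = "$auml"
		then
			echo "readdir: name returned as NFC (as created)"
		elif test "$listed" = "$aumlcdiar"
		then
			echo "readdir: name returned as NFD"
		else
			echo "readdir: name returned in some other form"
		fi
	)
	status=$?
	rm -rf "$dir"
	return $status
}

probe_unicode_normalization
```

If the lookup succeeds while the listing still shows the NFC bytes, the
filesystem is matching names insensitively rather than converting them,
and the prereq could key off the listing instead of "test -f".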



