Re: Feature request: better error messages when UTF-8 bites


>
> BTW you actually raise another issue - I do think for file paths git could
> either recompose (NFC) or decompose (NFD) the strings on storage and
> comparison (which should probably be an option). The current default in
> 2.30.2 is to treat them as binary and to escape them on print.
> Consider the following when using core.quotePath=false:
>
> $ touch "nfc_$(printf '\xf4')"

This is not valid UTF-8, is it? A lone 0xf4 byte is only the start of a
4-byte sequence.
We should probably use octal escapes instead: not all printf
implementations support hex escapes (\xHH), but all of them support
octal:

auml=$(printf '\303\244')
aumlcdiar=$(printf '\141\314\210')
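
To make the difference visible regardless of what a terminal or mail
client renders, the bytes can be dumped as hex. A minimal sketch,
assuming a POSIX shell with od and tr; the variable names nfc, nfd,
nfc_hex and nfd_hex are only illustrative:

```shell
# U+00E4 (precomposed, NFC) encoded as UTF-8: c3 a4
nfc=$(printf '\303\244')
# U+0061 followed by U+0308 (decomposed, NFD) encoded as UTF-8: 61 cc 88
nfd=$(printf '\141\314\210')

# Hex dump of each string, with od's padding and newline stripped
nfc_hex=$(printf '%s' "$nfc" | od -An -tx1 | tr -d ' \n')
nfd_hex=$(printf '%s' "$nfd" | od -An -tx1 | tr -d ' \n')

echo "NFC bytes: $nfc_hex"
echo "NFD bytes: $nfd_hex"
```

Both strings render as the same character, but they are 2 and 3 bytes
long respectively, so a byte-wise comparison treats them as different.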


> $ touch "nfd_$(printf '\x6f\xcc\x82')"
> $ git add nf[cd]*
> $ git status
> On branch test
> Changes to be committed:
>   (use "git restore --staged <file>..." to unstage)
>     new file:   nfc_ô
>     new file:   nfd_ô

>
> I'm not sure how the Unicode will be translated here; it might depend on the
> mail client whether they even get sent as-is, but both show the exact same
> file name, one in NFD and one in NFC form.

Translated by whom?
Most programs do not translate anything here.
>
> Both are canonically equivalent and reversible. It appears macOS already
> decomposes (NFD?) filenames by default, and git provides an option to
> recompose the characters (core.precomposeUnicode) which, according to the
> manual, is not even usable on Linux...

Yes. Technically you can have both under Linux, unless you are running
ZFS, where a file system can be created Unicode-aware (or not, which is
the default).
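
Whether both names can coexist on a given machine is easy to check. A
rough sketch, assuming mktemp -d is available; the names dir and count
are illustrative, and the result depends entirely on the file system
behind the temporary directory:

```shell
# Scratch directory on whatever file system the temp location uses
dir=$(mktemp -d)

# The same visible name, once precomposed (NFC), once decomposed (NFD)
: > "$dir/$(printf '\303\244')"
: > "$dir/$(printf '\141\314\210')"

# 2 entries: the file system stores names as raw bytes
# 1 entry:  the file system treats both forms as the same name
count=$(ls -A "$dir" | wc -l | tr -d ' ')
echo "directory entries: $count"

rm -rf "$dir"
```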

But why do you want to have two files on disk with different normalizations?


>
> More on Unicode normalization: https://unicode.org/reports/tr15/
>
> --
> Thomas



