[]
>
> BTW you actually raise another issue - I do think for file paths git could
> either recompose (NFC) or decompose (NFD) the strings on storage and
> comparison (which should probably be an option... the current default for
> 2.30.2 is to treat them and print them as binary (escaped on print).
> Consider the following when using core.quotePath=false:
>
> $ touch "nfc_$(printf '\xf4')"

This is not valid Unicode (UTF-8), is it?
We should probably use the octal version, since not all printf
implementations support hex escapes (\x..), but all of them support octal:

auml=$(printf '\303\244')
aumlcdiar=$(printf '\141\314\210')

> $ touch "nfd_$(printf '\x6f\xcc\x82')"
> $ git add nf[cd]*
> $ git status
> On branch test
> Changes to be committed:
>   (use "git restore --staged <file>..." to unstage)
>         new file:   nfc_ô
>         new file:   nfd_ô
>
> I'm not sure how the Unicode will be translated here, it might depend on the
> mail client if they even's get sent as-is, but both shows the exact same
> file name, one in NFD and one in NFC format.

Translated by whom? Most programs do not translate anything here.

>
> Both are canonically equivalent and reversible. It appears MacOS already
> decompose (NFD?) filenames by default and git provides an option to
> recompose the characters (core.precomposeUnicode) which, according to the
> manual, is not even usable on Linux...

Yes. Technically you can have both under Linux, unless you are running ZFS,
which may be created Unicode-aware (or not, which is the default).
But why do you want to have two files on disk with different normalizations?

>
> More on Unicode normalization: https://unicode.org/reports/tr15/
>
> --
> Thomas
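
For illustration, here is a minimal sketch of the octal approach applied to the o-circumflex from the quoted example. The variable names nfc and nfd are made up for this sketch, and the NFC byte sequence assumes UTF-8 (c3 b4) rather than the lone Latin-1 byte 0xf4 used in the quoted command.

```shell
# Octal escapes are the portable way to emit raw bytes with printf.
nfc=$(printf '\303\264')      # U+00F4, precomposed o-circumflex (UTF-8: c3 b4)
nfd=$(printf '\157\314\202')  # U+006F 'o' + U+0302 combining circumflex (6f cc 82)

# The two strings look alike when rendered but are different byte sequences.
if [ "$nfc" = "$nfd" ]; then
    echo "same bytes"
else
    echo "different bytes"
fi

# Dump the bytes in hex to make the difference visible.
printf '%s' "$nfc" | od -An -tx1
printf '%s' "$nfd" | od -An -tx1
```

Using "nfc_$nfc" and "nfd_$nfd" as arguments to touch should then create the two differently normalized names on a file system that does not normalize.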