On 2020-03-02, lampahome <pahome.chen@xxxxxxxxxx> wrote:
> According to case insensitive since kernel 5.2, d_compare will
> transform string into normalized form and then compare.
>
> But why do we need this normalization function? Could we just compare
> by utf8 string?

The problem is that there are multiple ways to represent the same glyph
in Unicode. For instance, you can represent Å (the symbol for angstrom)
as either U+212B or U+0041 U+030A (the Latin letter "A" followed by the
combining ring above). Different software may choose to represent the
same glyph in different Unicode forms, so a plain byte-wise comparison
of the UTF-8 strings would treat them as different names. Hence the need
for normalisation.

[1] is the Wikipedia article that describes this problem and what the
different kinds of Unicode normalisation are.

[1]: https://en.wikipedia.org/wiki/Unicode_equivalence

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>
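
The angstrom example above can be checked from userspace. The following is only a minimal sketch using Python's standard unicodedata module. It is not the kernel's code path, and the helper name `equivalent` and the choice of NFD as the default form are assumptions made for illustration.

```python
import unicodedata

# Two encodings of the same glyph, taken from the example above.
angstrom_sign = "\u212b"      # ANGSTROM SIGN
decomposed = "\u0041\u030a"   # LATIN CAPITAL LETTER A + COMBINING RING ABOVE

# A plain comparison sees two different code point sequences,
# and therefore two different UTF-8 byte strings.
raw_equal = angstrom_sign == decomposed
bytes_equal = angstrom_sign.encode("utf-8") == decomposed.encode("utf-8")


def equivalent(a: str, b: str, form: str = "NFD") -> bool:
    """Hypothetical helper: compare two strings after normalising both.

    `form` is any form accepted by unicodedata.normalize
    ("NFC", "NFD", "NFKC" or "NFKD").
    """
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


if __name__ == "__main__":
    print("raw comparison equal:       ", raw_equal)
    print("UTF-8 bytes equal:          ", bytes_equal)
    print("equal after normalisation:  ", equivalent(angstrom_sign, decomposed))
```

The raw and byte-wise comparisons report the strings as different, while the normalised comparison reports them as the same, which is the behaviour a case-insensitive (and normalisation-aware) name lookup needs.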