On Sun, Jan 19, 2020 at 11:14:55PM +0100, Pali Rohár wrote:

> So when UTF-8 on VFS for VFAT is enabled, then for VFS <--> VFAT
> conversion are used utf16s_to_utf8s() and utf8s_to_utf16s() functions.
> But in fat_name_match(), vfat_hashi() and vfat_cmpi() functions is used
> NLS table (default iso8859-1) with nls_strnicmp() and nls_tolower().
>
> Which means that fat_name_match(), vfat_hashi() and vfat_cmpi() are
> broken for vfat in UTF-8 mode.
>
> I was thinking how to fix it, and the only possible way is to write a
> uni_tolower() function which takes one Unicode code point and returns
> lowercase of input's Unicode code point. We cannot do any Unicode
> normalization as VFAT specification does not say anything about it and
> MS reference fastfat.sys implementation does not do it neither.

Then how can that possibly be broken?  If it matches the native behaviour,
that's it.

> As you can see lowercase 'd' and uppercase 'D' are same, but lowercase
> 'č' and uppercase 'Č' are not same. This is because 'č' is two bytes
> 0xc4 0x8d sequence and comparing is done by Latin1 table. 0xc4 is in
> Latin 'Ä' which is already in uppercase. 0x8d is control char so is not
> changed by tolower/toupper function.

Again, who the hell cares?  Does the behaviour match how Windows handles
that thing?

"Case" is not something well-defined; the only definition is "whatever
weird crap does the native implementation choose to do".  That's the only
reason to support that garbage at all...
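
For reference, the byte-level effect described in the quoted text can be reproduced in userspace with a minimal sketch. The helpers latin1_tolower() and latin1_strnicmp() below are hypothetical stand-ins written for illustration, assuming plain iso8859-1 case rules; they are not the kernel's nls_tolower() or nls_strnicmp(), and the sketch says nothing about what Windows does with the same names.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for a Latin-1 lowercase lookup. This only
 * approximates iso8859-1 case rules; it is not the kernel NLS table.
 */
static unsigned char latin1_tolower(unsigned char c)
{
	if (c >= 'A' && c <= 'Z')
		return c + 0x20;
	/* 0xC0..0xDE map to 0xE0..0xFE, except 0xD7 (multiplication sign). */
	if (c >= 0xC0 && c <= 0xDE && c != 0xD7)
		return c + 0x20;
	/* Everything else is unchanged, including C1 controls 0x80..0x9F. */
	return c;
}

/*
 * Byte-wise case-insensitive compare over len bytes.
 * Returns 0 on a match and 1 on the first mismatch.
 */
static int latin1_strnicmp(const unsigned char *a, const unsigned char *b,
			   size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		if (latin1_tolower(a[i]) != latin1_tolower(b[i]))
			return 1;
	}
	return 0;
}

/* UTF-8 encodings of the characters from the quoted example. */
static const unsigned char lower_d[] = { 0x64 };              /* 'd' */
static const unsigned char upper_d[] = { 0x44 };              /* 'D' */
static const unsigned char lower_c_caron[] = { 0xC4, 0x8D };  /* U+010D */
static const unsigned char upper_c_caron[] = { 0xC4, 0x8C };  /* U+010C */
```

With these definitions, latin1_strnicmp(lower_d, upper_d, 1) returns 0, while latin1_strnicmp(lower_c_caron, upper_c_caron, 2) returns non-zero: the shared lead byte 0xc4 folds identically on both sides, and the trailing bytes 0x8d and 0x8c are left alone by the table, so they never compare equal.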