From: Pali Rohár
> Sent: 20 January 2020 15:20
...
> This is not possible. There is 1:1 mapping between UTF-8 sequence and
> Unicode code point. wchar_t in kernel represent either one Unicode code
> point (limited up to U+FFFF in NLS framework functions) or 2bytes in
> UTF-16 sequence (only in utf8s_to_utf16s() and utf16s_to_utf8s()
> functions).

Unfortunately there is neither a 1:1 mapping of all possible byte sequences
to wchar_t (or Unicode code points), nor a 1:1 mapping of all possible
wchar_t values to UTF-8.

Really both need to be defined - even for otherwise 'invalid' sequences.

Even the 16-bit surrogate values (0xd800 to 0xdfff) can appear on their own
in Windows filesystems (according to Wikipedia).

It is all too easy to get sequences of values that cannot be converted
to/from UTF-8.

	David
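To make the two failure cases concrete, here is a small userspace sketch in Python. It is not kernel code, and it is not how the NLS functions behave. The helper names utf16_unit_to_utf8() and utf8_to_code_points() are made up for illustration, and it assumes a strict UTF-8 codec that rejects surrogates and malformed bytes, which is what Python's default does.

```python
def utf16_unit_to_utf8(unit):
    """Encode one 16-bit value as UTF-8, or return None if strict UTF-8 has no encoding for it."""
    try:
        return chr(unit).encode("utf-8")
    except UnicodeEncodeError:
        # Lone surrogates (0xD800..0xDFFF) are not Unicode scalar values,
        # so a strict encoder refuses them.
        return None


def utf8_to_code_points(data):
    """Decode bytes as strict UTF-8 into a list of code points, or return None if the bytes are invalid."""
    try:
        return [ord(ch) for ch in data.decode("utf-8")]
    except UnicodeDecodeError:
        # Bytes such as 0xFF, overlong forms, and encoded surrogates
        # do not map to any code point.
        return None
```

With these, utf16_unit_to_utf8(0x20AC) gives the three bytes e2 82 ac, while utf16_unit_to_utf8(0xD800) gives None. Going the other way, utf8_to_code_points(b"\xff") and utf8_to_code_points(b"\xed\xa0\x80") both give None. A filesystem that has to round-trip such names needs its own defined behaviour for those cases.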