On Thu 07-06-18 10:15:57, Linus Torvalds wrote:
> On Thu, Jun 7, 2018 at 9:59 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> >
> > Yeah, it's a totally broken format, but we shouldn't be thinking that
> > filenames which come to us in UTF16 are actually in UCS2.
>
> Ok, old fixed-2-byte UCS2 is certainly even worse than UTF16, so no
> argument on that side.
>
> I was more wondering who actually *does* this, but it sounds like it
> was a mostly just that we used to do the old-style UCS-2, and this is
> extending it to the slightly less broken "extended UCS-2" aka UTF-16.
>
> I'd just have liked to see some more background in the logs, because
> this seemed to me such an odd change to do that it made me go "why
> would anybody ever care?".
>
> But I guess MS (and maybe even OSX) _)still_ haven't gotten the memo
> on utf-8 and actually use UTF-16..
>
> Oh well.

Yes, Windows still apparently use UTF-16 (at least according to a user
report I've got which motivated this work) and the OSTA UDF standard
defines that filenames in UDF are in "OSTA Compressed Unicode" which may
be actually UTF-16 if the program creating the filesystem image chooses so.

I agree that the changelogs should have mentioned this. My bad.

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR