On Tue, 2013-03-19 at 12:24 +0100, Nicolas Mailhot wrote:
>
> On Tue, 19 March 2013 at 11:38, Ian Malone wrote:
>
> > and holding up the release for what is basically a triviality seems a
> > bit silly.
>
> The perception correct UTF-8 handling is a triviality that should be
> worked on at some later date is the reason we have this breakage now.

No. As I understand it, this bug would have happened even if we were
still in the 20th century and using the legacy 8-bit encodings.

We have an 'is it text?' function which arbitrarily allows 2% of bytes
to be >= 0x80. Which means that even in ISO8859-1, a file containing
just the words "Schrödinger's Cat" wouldn't be considered to be text.

It's just broken; it's not even UTF-8 specific. In fact, UTF-8 makes
things *easier* because you can check for valid UTF-8 byte sequences
instead of just counting bytes >= 0x80.

-- 
dwmw2
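
To make the comparison above concrete, here is a minimal sketch in Python.
It is not the code under discussion: both function names are hypothetical,
and the ratio check is only a guess at what "allows 2% of bytes to be
>= 0x80" might look like. It shows the two approaches side by side on the
"Schrödinger's Cat" example.

```python
def looks_like_text_by_ratio(data: bytes, max_high_ratio: float = 0.02) -> bool:
    """Hypothetical reconstruction of the heuristic described above:
    accept the data as text only if at most 2% of its bytes are >= 0x80."""
    if not data:
        return True
    high_bytes = sum(1 for b in data if b >= 0x80)
    return high_bytes / len(data) <= max_high_ratio


def looks_like_text_utf8(data: bytes) -> bool:
    """Alternative check: accept the data if it is a valid UTF-8 byte
    sequence, however many non-ASCII characters it contains."""
    try:
        data.decode("utf-8")  # strict decoding raises on invalid sequences
    except UnicodeDecodeError:
        return False
    return True


sample = "Schrödinger's Cat"
latin1 = sample.encode("iso-8859-1")  # 17 bytes, one of them (0xF6) >= 0x80
utf8 = sample.encode("utf-8")         # 18 bytes, two of them >= 0x80

# One high byte in 17 is already about 6%, so the ratio check rejects
# the ISO 8859-1 file, and the UTF-8 file fares no better.
print(looks_like_text_by_ratio(latin1))
print(looks_like_text_by_ratio(utf8))

# Validating the byte sequence accepts the UTF-8 file regardless of how
# many non-ASCII characters it holds.
print(looks_like_text_utf8(utf8))
```

Note that validity alone is not a complete 'is it text?' test, since NUL
bytes and control characters are also valid UTF-8. A real check would
presumably still need to reject those separately.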
-- 
devel mailing list
devel@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/devel