Re: UTF-8 and filenames

On Wednesday, 14 March 2007 at 17:03 -0500, Callum Lerwick wrote:

> Now interpreting the meaning of these bitstreams is a higher level
> display problem. The great thing about having a "case sensitive"
> filesystem is the kernel doesn't have to care about encodings. That
> bloat is pushed to userspace. 

Except userspace has no way to guess the filename encoding: a filename
is too short for any sort of heuristic to work on, and Linux filesystems
won't provide any other hint.
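
A rough sketch of that ambiguity in Python. The sample bytes and the
candidate encoding list are assumptions chosen for illustration, and
plausible_decodings is a hypothetical helper, not an existing API:

```python
def plausible_decodings(raw, candidates):
    """Return {encoding: text} for every candidate that decodes raw without error."""
    results = {}
    for encoding in candidates:
        try:
            results[encoding] = raw.decode(encoding)
        except UnicodeDecodeError:
            # This candidate is ruled out; the others remain equally plausible.
            pass
    return results


# Three bytes as they might come back from readdir(): no encoding tag attached.
raw_name = b"\xe9t\xe9"

readings = plausible_decodings(raw_name, ["utf-8", "latin-1", "iso8859_5"])
for encoding, text in readings.items():
    print(encoding, repr(text))
```

UTF-8 is ruled out here, but both 8-bit candidates decode cleanly to
different text, and nothing in three bytes says which one was meant.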

The only sane thing userspace can do is postulate a system-wide encoding
and display garbage for filenames encoded otherwise (hoping that will
force users to use the default encoding), even though that will fail
spectacularly with removable media or legacy partitions that use
another convention. It is also of little help to apps that do something
other than display filenames.
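
One way this could look in Python, as a sketch only. It assumes UTF-8 is
the postulated system-wide encoding, and the function names are
hypothetical. Display uses replacement characters (the "garbage"), while
apps that need to hand the name back to the kernel can use the
"surrogateescape" error handler so the original bytes survive a round
trip:

```python
def display_name(raw, assumed="utf-8"):
    """Text for showing to a user; undecodable bytes become U+FFFD."""
    return raw.decode(assumed, errors="replace")


def to_text(raw, assumed="utf-8"):
    """Text that can be turned back into exactly the same bytes."""
    return raw.decode(assumed, errors="surrogateescape")


def to_bytes(text, assumed="utf-8"):
    """Inverse of to_text: recover the original filename bytes."""
    return text.encode(assumed, errors="surrogateescape")


legacy = b"\xe9t\xe9"  # Latin-1 bytes on a system assumed to be UTF-8

print(repr(display_name(legacy)))            # lossy, for display only
print(to_bytes(to_text(legacy)) == legacy)   # lossless round trip
```

The display form cannot be used to open the file again, which is why
the two paths are kept separate.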

Casing and sorting are quite another problem. If the encoding is fixed,
they only require locale knowledge, which is already exported to
userspace reliably.

Also don't forget that UTF-8 coverage comes at the price of forbidding
some byte sequences that are valid in legacy 8-bit encodings. So anyone
blindly injecting data in a legacy 8-bit encoding into a UTF-8 system is
asking for trouble (and Linus refused to enforce UTF-8 safety
kernel-side).
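
Since the kernel will not check, a userspace tool could check before
writing a name. A minimal sketch in Python, where is_valid_utf8 is a
hypothetical helper and the sample names are made up for illustration:

```python
def is_valid_utf8(raw):
    """True if raw is a well-formed UTF-8 byte sequence."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


samples = {
    "ascii": b"report.txt",
    "utf-8": "caf\u00e9".encode("utf-8"),
    "latin-1": "caf\u00e9".encode("latin-1"),
}

for label, raw in samples.items():
    print(label, raw, is_valid_utf8(raw))
```

Only the Latin-1 sample is rejected: its single high byte 0xE9 would
have to start a multi-byte sequence in UTF-8, and nothing follows it.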

-- 
Nicolas Mailhot

--
Fedora-maintainers mailing list
Fedora-maintainers@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/fedora-maintainers
