This long email asks for no one's close attention but Florian's. Other
readers can skim the email or skip it, at their discretion.

On Wed, Oct 20, 2021 at 10:12:02AM +0200, Florian Weimer wrote:
> > > What does this mean? I think only byte 0x2f is reserved. The UTF-8
> > > comment is misleading. A historic/overlong encoding of / in multiple
> > > UTF-8 bytes is *not* reserved.
> >
> > I had not known that UTF-8 had an alternate encoding for any ASCII
> > character. Does it indeed have an alternate encoding? If so, where
> > can I learn more?
>
> See the Security Considerations section in the RFC:
>
> <https://datatracker.ietf.org/doc/html/rfc3629#section-10>
>
> Most file systems do not treat file names as UTF-8, so they do not
> perform any validation.

I see. That RFC explains it well: there exists no legal alternate
encoding, but rather several illegal encodings that, were they not
illegal, *would be* alternate encodings. In the case of the solidus,
the legal encoding is

    2F

but the illegal encodings are

    C0 AF
    E0 80 AF
    F0 80 80 AF
    F8 80 80 80 AF
    FC 80 80 80 80 AF

This problem has nothing to do with Unicode but is merely an artifact of
UTF-8 -- and that's your point, isn't it? Most filesystems do not care
about UTF-8, so they do not perform any validation.

In view of your advice, I should think about how to rewrite the relevant
prose so that it is neither [i] confusing to inexperienced users nor
[ii] inaccurate.

Question: the filename(7) manual page ought to emphasize the
requirements of filesystems widely deployed for general-purpose use on
standard Linux installations. As far as I know, exactly three such
filesystems exist:

  * ext4;
  * xfs;
  * btrfs.

Do any other such filesystems exist?

Comments:

 1. I have heard of reiserfs and reiser4 but have not heard of anyone
    that actually uses them since about 15 years ago.

 2. There are also nfs, iso9660/joliet/rockridge, vfat, ntfs, cifs and
    a few others. These are network-oriented, archive-oriented,
    special-purpose, foreign and/or compatibility-oriented filesystems.
    If the filename(7) manual page mentions the requirements of such
    filesystems at all, it should mention them only briefly, in
    passing. Otherwise, the page would become too confusing and grow
    too long. (Also, I know too little about most of these extra
    filesystems to write about them.)

 3. Happily, the three main filesystems -- ext4, xfs and btrfs -- all
    have similar filename requirements as far as I know.
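
To make the overlong-encoding point above concrete, here is a minimal Python sketch. It is an illustration only, not anything taken from the manual page or the kernel. The helper names (is_valid_utf8, contains_slash_byte, lax_decode) are hypothetical, and lax_decode is an assumed model of a permissive decoder that does not reject overlong forms. The sketch checks three things about the byte sequences listed earlier: a strict UTF-8 decoder rejects them, none of them contains the byte 0x2F, and a lax decoder would nevertheless map each of them to U+002F.

```python
# The legal encoding of the solidus and the illegal (overlong) ones
# listed in the message above.
LEGAL = bytes.fromhex("2F")
OVERLONG = [
    bytes.fromhex("C0AF"),
    bytes.fromhex("E080AF"),
    bytes.fromhex("F08080AF"),
    bytes.fromhex("F8808080AF"),
    bytes.fromhex("FC80808080AF"),
]


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if Python's strict UTF-8 decoder accepts raw."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def contains_slash_byte(raw: bytes) -> bool:
    """Return True if the reserved byte 0x2F occurs anywhere in raw."""
    return 0x2F in raw


def lax_decode(raw: bytes) -> int:
    """Decode one sequence WITHOUT rejecting overlong forms.

    Hypothetical model of a permissive decoder: the number of leading
    1 bits in the first byte gives the sequence length, and each
    continuation byte contributes its low six bits.
    """
    lead = raw[0]
    if lead < 0x80:
        return lead
    length = 0
    mask = 0x80
    while lead & mask:
        length += 1
        mask >>= 1
    code_point = lead & (mask - 1)
    for byte in raw[1:length]:
        code_point = (code_point << 6) | (byte & 0x3F)
    return code_point


if __name__ == "__main__":
    for seq in [LEGAL] + OVERLONG:
        print(
            seq.hex(" ").upper(),
            "valid" if is_valid_utf8(seq) else "invalid",
            "has 0x2F" if contains_slash_byte(seq) else "no 0x2F",
            "lax decode -> U+%04X" % lax_decode(seq),
        )
```

If only the byte 0x2F is reserved, as Florian suggests, then a filesystem that does no UTF-8 validation would treat each overlong sequence as ordinary filename bytes. The risk described in the RFC arises only when some other program decodes such a name laxly.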