Re: Should the IETF be condoning, even promoting, BOM pollution?

On 9/18/17 17:57, Ted Lemon wrote:
> On Sep 18, 2017, at 6:24 PM, Adam Roach <adam@xxxxxxxxxxx> wrote:
>> Unless you know something about NTFS, ext4, HFS, and exFAT that I don't, this sort of information isn't generally part of file metadata at all.
> If you download a file in your web browser and save it to disk, the thing responsible for deciding whether or not to apply the BOM is the thing that did the download, not the server from which it was downloaded. The server already identified the file encoding type: utf8 (not text/utf8, sorry about that). If the thing that did the download does the wrong thing, that's not our problem.


I think we're talking at cross purposes here.

Today, as we speak, I have a copy of the RFC repository on my hard drive. (To be precise, I have it on most of the hard drives of the various machines that I use.) For my current workflow, I *think* all of them got there via rsync, although it's possible that some of them are still using an old wget-based setup. How they got there is immaterial, because a careful examination would show the same result for both methods (and any others I could think of, including FTP mirroring and manually downloading via web browsers): each file is a sequence of bytes with a ".txt" file extension, identical regardless of which tool downloaded it. There is nothing else about the file to indicate its encoding.[1]

Okay. So, now, I open up the local file browser to that file on my hard drive, and double-click on an RFC. An application is launched. Let's say that application is Wordpad. How does it know which character encoding to use for this file?
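
To make the question concrete, here is a rough sketch (in Python, purely for illustration) of roughly what an application can do when all it has is the bytes. The function name sniff_encoding and its return labels are hypothetical, not taken from Wordpad or any real editor; the only real API used is Python's standard codecs module. Without a BOM, everything past the ASCII check is a guess.

```python
import codecs

# BOM signatures, checked longest-first so that the UTF-32 LE BOM
# (FF FE 00 00) is not mistaken for the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_encoding(data: bytes) -> str:
    """Hypothetical helper: guess an encoding from raw file bytes alone."""
    # 1. A BOM, if present, is the only in-band signal.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # 2. Pure 7-bit content decodes the same way under ASCII and UTF-8.
    if data.isascii():  # bytes.isascii() requires Python 3.7+
        return "ascii"
    # 3. No BOM and non-ASCII bytes: the best we can do is a heuristic.
    try:
        data.decode("utf-8")
        return "utf-8 (guess)"
    except UnicodeDecodeError:
        return "unknown"
```

For example, sniff_encoding(b"\xef\xbb\xbfNetwork Working Group") reports "utf-8-sig", while the same text without the three leading bytes reports "ascii". A file with non-ASCII bytes and no BOM lands in step 3, where the answer depends on whatever heuristic the application happens to implement.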

/a


____
[1] If this is one of the Macs, and the download tool was really Mac-centric, it might have included a resource fork with some additional metadata, but (AFAIK) even the resource fork does not include character encoding. Other operating systems have similar constructs, but I'm less familiar with them.



