On 11/30/20 10:52 AM, Roman Danyliw wrote:
> If one visits https://www.rfc-editor.org/rfc/rfc7230.txt, is a TXT not returned?
How would you know? If you visit a file from a browser, you only know
what the browser shows you.
If for instance I click on that link and then hit ^U in the browser
(which happens to be Brave), what I now see appears to be text with line
numbers. I can't tell if the formfeeds are there or not.
If I do the same in Firefox, I get text without line numbers. The
formfeeds actually appear to be there - they show up as squares with
"000C" in them.
In Chromium, I get the same behavior as Brave.
In the past, I've gotten other results, such as seeing HTML tags
embedded in what is supposedly the "source" of the plain text file.
The general problem is that when you use a web browser to view something
and then save or print it, the behavior is undefined. No standard says
what should happen, and you don't know what you're going to get.
But a related problem is that people keep gratuitously changing things.
So even if you find a workaround to some bit of damage, that workaround
is not assured to work in the future.
To be clear, we're just talking about damage caused by web browsers
here. I doubt that anyone is clicking on a .txt and getting .html or
.pdf from the web server.
But this illustrates why some of us prefer to avoid using web browsers
for some things, and instead rely on tools that have well-defined behavior.
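As a rough illustration of what I mean by well-defined behavior, here is a minimal Python sketch that fetches the raw bytes and counts the formfeeds, with no browser rendering in between. The function names (fetch_raw, count_form_feeds, main) are made up for this example, and it assumes the server sends the body without any content encoding that would need undoing.

```python
import urllib.request

FORM_FEED = b"\x0c"  # the character browsers may show as a box labeled "000C"


def count_form_feeds(data: bytes) -> int:
    """Count formfeed bytes in the raw payload."""
    return data.count(FORM_FEED)


def fetch_raw(url: str) -> bytes:
    """Return the response body as bytes, without any rendering step."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def main() -> None:
    # Not called automatically, since it needs network access.
    body = fetch_raw("https://www.rfc-editor.org/rfc/rfc7230.txt")
    print(count_form_feeds(body), "formfeeds in", len(body), "bytes")
```

Calling main() by hand answers the "are the formfeeds there?" question directly from the bytes rather than from whatever a browser chooses to display.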
Also, I don't think the HTTP protocol corrupts files, though I have seen
software that would silently ignore an incomplete HTTP file transfer.
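One way a tool can avoid silently accepting a short transfer is to compare the declared Content-Length with the number of bytes actually received. The sketch below is only an illustration: transfer_is_complete is a hypothetical helper, and the check tells you nothing when the server sends no Content-Length (for example with chunked transfer coding).

```python
from typing import Mapping, Optional


def transfer_is_complete(headers: Mapping[str, str], body: bytes) -> Optional[bool]:
    """Compare the declared Content-Length with the bytes received.

    Returns True or False when a length was declared, and None when the
    header is absent and this check cannot decide either way.
    Note: a plain dict lookup is case-sensitive; the header objects
    returned by http.client match names case-insensitively.
    """
    declared = headers.get("Content-Length")
    if declared is None:
        return None
    return int(declared) == len(body)
```

For example, transfer_is_complete({"Content-Length": "10"}, b"01234") returns False, which is the case the careless software would have ignored.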
But if what you actually need to do is browse files to pick out the ones
you want, vanilla HTTP doesn't provide what a tool needs to reliably do
that.
Keith