On 12/1/20 11:40 AM, Phillip Hallam-Baker wrote:
Of course we
could have stuck to vinyl. But FTP is more like grandad's 78s.
There is absolutely nothing to recommend FTP over HTTP. Rsync is
vastly superior for file transfer.
False, on multiple levels.
The biggest flaw of HTTP for file transfer (if one wants to
transfer more than one file) is that HTTP doesn't have a built-in
way to list files, distinguish files from directories from other
kinds of nodes, and walk a file system.
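To make that concrete: FTP has listing built into the protocol, while over plain HTTP all you can do is GET a URL and scrape whatever index page the server happens to emit. A rough Python sketch (host and path are placeholders, not real servers):

    from ftplib import FTP
    from urllib.request import urlopen

    # FTP: walking the remote file system is part of the protocol itself.
    ftp = FTP("ftp.example.com")      # placeholder host
    ftp.login()                       # anonymous login
    names = ftp.nlst("/pub")          # NLST: machine-readable list of names
    ftp.retrlines("LIST /pub")        # LIST: human-readable listing (typically ls-style: type, size, date)
    ftp.quit()

    # HTTP/1.x: there is no listing verb at all.  The best you can do is GET
    # a URL and hope the server emits an HTML index page, whose format is
    # entirely server-specific and has to be scraped.
    html = urlopen("http://www.example.com/pub/").read()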
Rsync, certainly at the time that HTTP was designed (it may have
improved since then), had a LOT of overhead because it tried to
analyze each file for changes within the file, minimizing
bandwidth used (which, to be fair, was quite scarce) at a cost of
CPU time and latency. Circa 1993 I looked at using rsync to
replicate a web site to multiple locations (an early CDN, I suppose)
and found it completely inadequate.
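The kind of per-file analysis at issue works roughly like the rolling weak checksum below (a Python sketch of the idea, not anyone's actual code): each step is cheap, but it has to run at essentially every byte offset of the file being compared, which is where the CPU time and latency go.

    M = 1 << 16

    def weak_checksum(block):
        # weak checksum of one block: two 16-bit running sums
        a = sum(block) % M
        b = sum((len(block) - i) * byte for i, byte in enumerate(block)) % M
        return a, b

    def roll(a, b, out_byte, in_byte, blocksize):
        # slide the window one byte in O(1) instead of re-summing the block
        a = (a - out_byte + in_byte) % M
        b = (b - blocksize * out_byte + a) % M
        return a, b

    data = b"the quick brown fox jumps over the lazy dog" * 100
    L = 700                           # block size of the same order as rsync's default
    a, b = weak_checksum(data[:L])
    for k in range(1, len(data) - L):
        a, b = roll(a, b, data[k - 1], data[k + L - 1], L)
        assert (a, b) == weak_checksum(data[k:k + L])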
Of course FTP was designed for file transfer, including between
dissimilar systems (which were very common in ARPAnet days), and
HTTP wasn't designed for that purpose. There was nothing wrong
with designing a new protocol for the web, especially since the web
had different needs, different assumptions, and operated under
different conditions.
But the web has NEVER been a good way to do file transfer.
It wasn't in 1991, and it isn't today. And the protocol designed for
the web isn't either, not without adding some additional features.
The original attack was that HTTP was unnecessary because FTP existed. So the fact that FTP can do other things is utterly irrelevant to what I was responding to. We wrote a new protocol because FTP is a really shitty hypertext transfer protocol. That wasn't an arrogant or ignorant approach; libwww always had an FTP client in it.
Speaking of false...
Of course I have the benefit of knowing what I am talking about, having actually tried to make FTP work. And as of 1993 it simply didn't, as you would know had you tried it.
...
Having (re)implemented FTP for libwww, I can assure you that it was done for very good reason.
FTP is not really an independent protocol. It was actually designed as a feature add-on for Telnet. As a result, FTP is vastly less efficient than HTTP because you have two socket creations and teardowns per transfer.
FTP needs only N+1 connections per set of N transfers from a given site.
Not in 1993 it didn't. I took a look at the practicality of caching the connections for multiple transfers from the same site, and it plain didn't work because of all sorts of technical issues, such as socket exhaustion and the limited capabilities of the FTP servers of the era. At the time HTTP was written, HTML didn't support any form of transclusion, so it was one transfer per page.
The problem with caching the connection is that the FTP servers of the era were typically limited to 8 simultaneous users. So if browsers kept the control connection open, they would prevent other users from connecting. Sure, you could rewrite the FTP server, but why do that when you can write a new server with less effort?
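For concreteness, the connection caching being talked about looks roughly like this (a Python sketch with placeholder host and file names, nothing lifted from libwww): one control connection is held open for the whole session and each RETR opens and tears down its own data connection, which is where both the N+1 count and the occupied seat on an 8-user server come from.

    from ftplib import FTP

    files = ["index.html", "logo.gif", "paper.ps"]   # hypothetical file names

    ftp = FTP("ftp.example.com")     # one control connection, held open (placeholder host)
    ftp.login()
    for name in files:
        with open(name, "wb") as out:
            # each RETR negotiates, opens and tears down its own data
            # connection, so N files cost N data connections plus the one
            # control connection -- and that control connection occupies
            # one of the server's limited user slots the whole time
            ftp.retrbinary("RETR " + name, out.write)
    ftp.quit()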
Oh, and that is before we start to look at CGI, etc.
The gap between the theory as you understand it and the implementation at the time was huge.
I repeat, we understood what we were doing and we had good reasons for doing what we did.
Reuse is good, but attempting to reuse something that is not suited to the purpose is bad. HTTP is a very effective and reasonably efficient hypertext transfer protocol. It is also a pretty decent presentation layer, hence Web Services. HTTP/3 over QUIC is a very effective and very efficient hypertext transfer protocol. But it isn't as good at being a presentation layer as HTTP/1.1.
And that is OK.
We don't need to use the same tool for every task. I did argue early on in the HTTP/2 work for some features that would make it better as a presentation layer, but stopped when I realized that what I wanted was HTTP/1.1 with most of the features stripped out. And that is better achieved as a separate protocol.