Re: Protocol design: the Gemini project

On Tue, Dec 1, 2020 at 12:40 PM Keith Moore <moore@xxxxxxxxxxxxxxxxxxxx> wrote:

On 12/1/20 11:40 AM, Phillip Hallam-Baker wrote:

Of course we could have stuck to vinyl. But FTP is more like grandad's 78s. There is absolutely nothing to recommend FTP over HTTP. Rsync is vastly superior for file transfer.

False, on multiple levels.

The biggest flaw of HTTP for file transfer (if one wants to transfer more than one file) is that HTTP doesn't have a built-in way to list files, distinguish files from directories and other kinds of nodes, or walk a file system.
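
For concreteness, a minimal sketch of the kind of built-in listing FTP offers and HTTP/1.x lacks (Python stdlib; the host and path here are hypothetical):

    from ftplib import FTP

    ftp = FTP("ftp.example.org")   # hypothetical anonymous server
    ftp.login()                    # anonymous login
    ftp.retrlines("LIST /pub")     # built-in listing; Unix-style lines
    ftp.quit()                     #   start with "d" for directories

HTTP has no equivalent request; a server that exposes a listing does so as ad-hoc HTML the client has to scrape (WebDAV's PROPFIND was added later to fill exactly this gap).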
Rsync, certainly at the time that HTTP was designed (it may have improved since then), had a LOT of overhead because it tried to analyze each file for changes within the file, minimizing bandwidth used (which, to be fair, was quite scarce) at a cost of CPU time and latency.  Circa 1993 I looked at using rsync to replicate a web site to multiple locations (an early CDN, I suppose) and found it completely inadequate.
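
A rough sketch of the per-file analysis being described, i.e. the weak rolling checksum from the rsync algorithm (illustrative Python, not rsync's actual code):

    def weak_checksum(block: bytes) -> int:
        # rsync's cheap first-pass signature: an Adler-32-style pair of sums
        a = sum(block) & 0xFFFF
        b = sum((len(block) - i) * x for i, x in enumerate(block)) & 0xFFFF
        return (b << 16) | a

    def block_signatures(data: bytes, block_size: int = 700):
        # the receiver computes one signature per block of every file
        return [weak_checksum(data[i:i + block_size])
                for i in range(0, len(data), block_size)]

The sender then slides a byte-at-a-time window over its copy of the file looking for matching blocks; that scan is the CPU time and latency being traded for bandwidth.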

Of course FTP was designed for file transfer, including between dissimilar systems (which were very common in ARPAnet days), and HTTP wasn't designed for that purpose.  There was nothing wrong with designing a new protocol for the web, especially since the web had different needs, different assumptions, and operated under different conditions.

But the web has NEVER been a good way to do file transfer.  It wasn't in 1991, and it isn't today.  And the protocol designed for the web isn't either, not without some additional features.

The original attack was that HTTP was unnecessary because FTP existed. So the fact that FTP can do other things is utterly irrelevant to what I was responding to. We wrote a new protocol because FTP is a really shitty hypertext transfer protocol. That wasn't an arrogant or ignorant approach; libwww always had an FTP client in it.


On Tue, Dec 1, 2020 at 1:27 PM Joseph Touch <touch@xxxxxxxxxxxxxx> wrote:
Speaking of false...

Of course I have the benefit of knowing what I am talking about, having actually tried to make FTP work. And as of 1993 it simply didn't, as you would know had you tried it.

On Dec 1, 2020, at 8:40 AM, Phillip Hallam-Baker <phill@xxxxxxxxxxxxxxx> wrote:

...
Having (re)implemented FTP for LIBWWW, I can assure you that it was done for very good reason.

FTP is not really an independent protocol. It was actually designed as a feature add-on for Telnet. As a result, FTP is vastly less efficient than HTTP because each transfer costs you two socket creations and teardowns (control plus data) instead of one.
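
To illustrate the two setups and teardowns (Python stdlib; host and path hypothetical):

    from ftplib import FTP

    ftp = FTP("ftp.example.org")       # socket 1: the control connection
    ftp.login()
    with open("readme.txt", "wb") as f:
        # RETR opens socket 2: a fresh data connection, torn down per transfer
        ftp.retrbinary("RETR /pub/readme.txt", f.write)
    ftp.quit()

    # The HTTP/1.x equivalent rides a single TCP connection:
    from http.client import HTTPConnection
    conn = HTTPConnection("www.example.org")   # one socket, total
    conn.request("GET", "/pub/readme.txt")
    body = conn.getresponse().read()
    conn.close()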

FTP needs only N+1 connections per set of N transfers from a given site.

Not in 1993 it didn't. I took a look at the practicality of caching the connections for multiple transfers from the same site, and it plain didn't work because of all sorts of technical issues, such as socket exhaustion and the limited capabilities of the FTP servers of the era. At the time HTTP was written, HTML didn't support any form of transclusion, so it was one transfer per page.

The problem with caching the connection is that the FTP servers of the era were typically limited to 8 simultaneous users. So if browsers kept the control connection open, they would prevent other users from connecting. Sure, you could rewrite the FTP server, but why do that when you can write a new server with less effort?
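
In fairness, the N+1 pattern itself is easy to write down; this sketch reuses one control connection for three retrievals (hypothetical host and paths). The catch is exactly the one above: between transfers, this client still holds one of the server's 8 slots while doing nothing.

    from ftplib import FTP

    ftp = FTP("ftp.example.org")       # the 1 in N+1: one control connection
    ftp.login()
    for path in ["/pub/a.txt", "/pub/b.txt", "/pub/c.txt"]:
        with open(path.rsplit("/", 1)[-1], "wb") as f:
            ftp.retrbinary("RETR " + path, f.write)   # N data connections
    ftp.quit()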

Oh, and that is before we start to look at CGI, etc.

The gap between the theory as you understand it and the implementations of the time was huge.

I repeat, we understood what we were doing and we had good reasons for doing what we did.


Reuse is good, but attempting to reuse something that is not fit for purpose is bad. HTTP is a very effective and reasonably efficient hypertext transfer protocol. It is also a pretty decent presentation layer, hence Web Services. HTTP/3 over QUIC is a very effective and very efficient hypertext transfer protocol. But it isn't as good at being a presentation layer as HTTP/1.1.
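
What "presentation layer" means here, in one hypothetical exchange: the verb, status code, headers, and typed body frame an RPC-style call (Python stdlib; the endpoint and payload are invented for illustration):

    import json
    from http.client import HTTPConnection

    conn = HTTPConnection("api.example.org")
    payload = json.dumps({"op": "lookup", "key": "ftp-vs-http"})
    conn.request("POST", "/v1/query", body=payload,
                 headers={"Content-Type": "application/json"})
    resp = conn.getresponse()
    print(resp.status, json.loads(resp.read()))   # status line + typed body
    conn.close()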

And that is OK.

We don't need to use the same tool for every task. I did argue early on in the HTTP/2 work for some features that would make it better as a presentation layer, but I stopped when I realized that what I wanted was HTTP/1.1 with most of the features stripped out. And that is better achieved as a separate protocol.
