RE: Call for Community Feedback: Retiring IETF FTP Service

Hi Toerless!

Thanks for the detailed feedback.

> -----Original Message-----
> From: Toerless Eckert <tte@xxxxxxxxx>
> Sent: Tuesday, November 10, 2020 5:55 PM
> To: Roman Danyliw <rdd@xxxxxxxx>
> Cc: ietf@xxxxxxxx
> Subject: Re: Call for Community Feedback: Retiring IETF FTP Service
> 
> a) I don't think that "small community" is a sufficient analysis to
>    justify retirement of ftp. If this small community for example would be
>    significant mirrors or give access to the content to users who for
>    whatever reasons can not access the content differently, i am sure that
>    would influence the decision making.

Absolutely, if we were convinced that FTP provided unique access that might change the calculus.  However, this seems like a hypothetical not backed by the data and infrastructure we have right now.

As noted in the proposal, ALL data is available in at least two other ways (https://www.ietf.org/ietf-ftp or rsync).  By request volume, the vast majority of users strongly prefer HTTPS (FTP is 0.2% of HTTPS document traffic, and this undercounts HTTPS usage).  If bulk download semantics are desired, including incremental updates with no code required, then rsync is the best choice.  Do you have a user community in mind that can use FTP, but not HTTPS or rsync, that we need to consider?
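
To make the rsync option concrete, here is a minimal sketch of an incremental mirror driven from Python.  rsync itself needs no code; the wrapper only makes the flags explicit.  The host and module name (`rsync.ietf.org::internet-drafts`) and the function names are assumptions for illustration, so check the mirror documentation for the modules actually offered.

```python
import subprocess

# Assumption: illustrative host and module name, not confirmed by this thread.
RSYNC_SOURCE = "rsync.ietf.org::internet-drafts"


def build_rsync_command(source, dest, delete=True, dry_run=False):
    """Return the argv list for an incremental mirror of `source` into `dest`."""
    cmd = ["rsync", "--archive", "--compress", "--verbose"]
    if delete:
        # Remove local files that no longer exist upstream.
        cmd.append("--delete")
    if dry_run:
        # Report what would change without transferring anything.
        cmd.append("--dry-run")
    cmd += [source, dest]
    return cmd


def mirror(source=RSYNC_SOURCE, dest="./internet-drafts/"):
    """Run rsync; only files that differ from the local copy are transferred."""
    return subprocess.run(build_rsync_command(source, dest), check=True)
```

Running `mirror()` on a schedule gives the incremental-update behaviour; passing `dry_run=True` to `build_rsync_command` is a safe way to preview a run first.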

Per the usage data [1], 85% of the traffic comes from entities whose profiles don't strongly suggest they are mirroring to provide unique access:

** a dynamic IP address in a German ISP
** the proxy of a Fortune 100 company
** a Canadian IT services company
** a large US search engine company
** a leading Japanese research university ++
** website of a not-so-popular programming language
** a small Swedish software product company
** a small, several person US consulting company

++ maybe this one, but a cursory look didn't find an FTP server making that content available or a mirror of it served by HTTP.

> b) I don't know what "operational complexity" is supposed to mean.
>    Cost of operations would be a more useful measure to weigh against
>    the benefit, especially when its hard to quantify the benefit, given
>    how (see a) the benefit is not necessarily equal to the percentage of
>    utilization - and the cost likely is very small.

Running each service has a cost.  In the case of FTP, there is of course the FTP service itself.  On the backend, replicating all of the data in a way that exposes it correctly to FTP requires custom tooling that must be maintained.

You are exactly right about the benefit being a key consideration.  Usage appears to be extremely low which is why the community is being consulted.  That said, the usage is not zero.

I won't repeat the usage summary from the proposal or [1].
 
> c) The PDF and your email is not really honest, because it says "http",
>    but in reality every http URL is immediately redirected to https, aka:
>    retiring ftp would further strengthen the policy USER MUST USE END
>    TO END ENCRYPTION WHETHER THEY WANT TO OR NOT.
> 
>    I understand that this is the implied policy by security advocacy in the
>    IETF, but i think it is not the right policy for all content. I think
>    end-to-end encryption should be a choice. Of course, a content provider
>    like IETF could make that choice and force it upon users, but for example
>    for bulk download of public data it often is a quite performance reducing
>    choice.

Noted.  The text could have been more precise as HTTPS.  I'd prefer not to relitigate what was decided 5 years ago (https://www.ietf.org/about/groups/iesg/statements/maximizing-encrypted-access/).

>    Aka: https downloads are slower than ftp/rsync (rsync without ssh!).

Given that >99% of all traffic is bulk downloads [1] and, in the grand scheme of things, pretty low volume, can you clarify the use case in which the performance difference between FTP and HTTPS practically matters?

>    Encrypted download might also not be permitted by users operating
>    from secured environments where all actions need to be logged. Aka:
>    I think IETF should have the option for unencrypted access to its
>    data for all popular protocols and leave the choice to the users.

Can you say more about these secured environments?  Those that I am familiar with would not allow insecure protocols like FTP, would be air-gapped and not allow any internet access, or would heavily filter which internet sites can be contacted (and ietf.org doesn't seem like one that would be allowed).

From the data we have [1], it doesn't seem like many users are operating in a restricted environment.  I'm making assumptions here, but the biggest users of FTP, constituting 85% of the traffic, all seem very capable of consuming encrypted content:
** a dynamic IP address in a German ISP
** the proxy of a Fortune 100 company
** a Canadian IT services company
** a large US search engine company
** a leading Japanese research university
** website of a not-so-popular programming language
** a small Swedish software product company
** a small, several person US consulting company

I stopped counting at 38, but that's the number of addresses in a dynamic IP address range of a North American or European ISP, or at a university.  Those seem like safe bets for having unrestricted HTTPS access to reach ietf.org.  Combining those 38 with the 8 above (46 in total) makes one reasonably confident that at least half of the 91 users aren't in these restricted environments.  The number is likely much higher.

Given the number of replicas on the internet of the I-Ds and RFCs, if there were many such disadvantaged users, couldn't they go there?

>    Of course, this would be less a question of protocol if IETF would not
>    have forced MUST ENCRYPT onto TLS 1.3, because no-encrypt,authenticate-only
>    would be a perfect option for these type of use-cases when accessing public
>    data without requiring privacy. Oh well...
> 
> d) HTTP/HTTPS are not protocols that actually allow to search/find content,
>    like FTP is with its ability to list directories. For repositories like what
>    was/is on ftp, HTTP/HTTPS are severely inferior protocols. Directory listing
>    through per-client implemented screen scraping of directory listing output
>    format of few well-known web-server directory listing formats. And IETF
>    seems to be endorsing this. Give me a break. Where is INT area when you
>    need it ? ;-)) Do we even specify a standard format for IETF HTTP/HTTP
>    directory listing format, or do all mirroring script clients need to be
>    updated when/if the IETF web server would some time decide to change
>    implementations and use a different directory listing output format ?
>
>    WebDAV ? Have not seen this in operation on any browser to give me
>    better, FTP like experience on web file stores...

No argument on your observation.  
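
For concreteness, the per-client scraping described in (d) tends to look something like the sketch below.  It assumes an Apache-style autoindex page in which every entry is a relative `<a href>` link.  That assumption is exactly the fragility being pointed out, since nothing specifies the listing format.  The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class ListingParser(HTMLParser):
    """Collect relative links from an HTML directory listing."""

    def __init__(self):
        super().__init__()
        self.entries = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Skip sort links ("?C=N;O=D"), absolute paths, parent links and
        # off-site URLs; what remains is assumed to be a directory entry.
        if not href or href.startswith(("?", "/", "../")) or "://" in href:
            return
        self.entries.append(href)


def list_directory(html_text):
    """Return entry names found in one listing page (directories end in '/')."""
    parser = ListingParser()
    parser.feed(html_text)
    return parser.entries
```

A change of web server, or even of listing template, can silently break a parser like this, which is why rsync is the safer choice for anything that needs to enumerate the repository.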

From the data we have in [1], there is little evidence of "search/find" happening, if at all, on our FTP service.  Consider the more detailed analysis in [1].

"Search/find activity" would likely not look like the access patterns of:
** Search engine
** Sync of the repo
** Incremental sync of the repo
** Polling a single file
** Polling the status files
** Single file access

This leaves only the "unknown".  Practically, that is 12 users who collectively made 60 file accesses.  Two of those users downloaded the same document twice, which to me doesn't say discovery.  That leaves 10 users with 58 file accesses:

** user 1: 11 file accesses
** user 2-4: 8 file accesses
** user 5: 5 file accesses
** user 6-7: 4 file accesses
** user 8-9: 3 file accesses
** user 10: 2 file accesses

This is pretty light "search and find" usage, if that is what it is at all.  For perspective, this is 11% of FTP users, 0.1% of all FTP traffic and 0.001% of the traffic compared to HTTPS.

> e) rsync would survive as the only viable bulk update mechanism, and for
>    that one may argue its better than FTP. I am a fan. But not sure
>    that means it can replace ftp in all use-cases.

Completely agree.  It appears from the analysis that at least 86% (from [1], "full sync" and "incremental sync") of all requests could be satisfied by rsync.  Turning off FTP will also eliminate the 12% of requests that come from search engine traffic.  This leaves the following use cases, which, if scripted, would require code changes:

Polling single file (0.1% of FTP traffic) -- ready means to implement with HTTPS

Polling status file(s) (0.3% of FTP traffic) -- ready means to implement with HTTPS

Single file access (0.1% of FTP traffic, 37 requests) -- don't understand enough to say, perhaps inbound links to an ftp:// URL which can't be changed

Unknown (0.2% of FTP traffic, 60 requests) -- don't understand enough to say
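
As a rough illustration of the "ready means to implement with HTTPS" for the two polling cases above, a script can poll with a conditional GET so that an unchanged file costs only a 304 response.  This is a sketch under assumptions: the function names are hypothetical, and it assumes the server returns `ETag` and/or `Last-Modified` headers.

```python
import urllib.error
import urllib.request


def build_poll_request(url, etag=None, last_modified=None):
    """Build a GET request that asks for the body only if the file changed."""
    req = urllib.request.Request(url, method="GET")
    if etag:
        req.add_header("If-None-Match", etag)
    if last_modified:
        req.add_header("If-Modified-Since", last_modified)
    return req


def poll_once(url, etag=None, last_modified=None, timeout=30):
    """Return (body, etag, last_modified); body is None when nothing changed."""
    req = build_poll_request(url, etag, last_modified)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return (resp.read(),
                    resp.headers.get("ETag"),
                    resp.headers.get("Last-Modified"))
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Not Modified: keep the validators we already have.
            return None, etag, last_modified
        raise
```

A caller would store the returned validators and pass them back on the next poll, which replaces an FTP script that repeatedly fetches the same file.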

>    The IETF documentation for rsync
>    is quite lacking. I would strongly suggest to put better documentation,
>    on a https://rsync.tools.ietf.org page so users for whom https is not the
>    best choice can quickly get up to speed.

Point taken.  In fairness, ~86% of these requests (i.e., those that sync I-Ds and RFCs) would be handled by this documentation, which covers rsync:

https://www.ietf.org/standards/ids/internet-draft-mirror-sites/

>    Also: before removing ftp.ietf.org, i would STRONGLY suggest to simply
>    remove all content, but keep the web service alive with a forward pointer,
>    (README) e.g.: to http://rsync.tools.ietf.org - and let that run for at least
>    one year (or longer). That way, all owners of scripts could inf the
>    forward information.

Noted as an additional retirement transition approach.  From everything I've heard from the two script owners who have spoken up, this unfortunately wouldn't help them, although using rsync might.

> f) Q: Was this type of question raised before investing all the effort
>    into the analysis of the situation and what looks like almost finished
>    decision making ? i can't remember an earlier Q to the community...

Nothing is finished.  This more detailed plan was prepared in anticipation of the questions left unanswered in the 2015 discussion being asked again: How much is FTP being used?  Is access to information going to be lost?  What about stable references in documents?

Regards,
Roman

[1] https://mailarchive.ietf.org/arch/msg/ietf/bEIgcOFMA73s5BG_vZppQsUZ2eU/

> Cheers
>     Toerless
> 
> On Tue, Nov 10, 2020 at 02:23:58AM +0000, Roman Danyliw wrote:
> > Hi!
> >
> > The Internet Engineering Steering Group (IESG) is seeking community input on
> retiring the IETF FTP service (ftp://ftp.ietf.org, ftp://ops.ietf.org, ftp://ietf.org).
> A review of this service has found that FTP appears to serve a very small
> community and HTTP has become the access mechanism of choice.  Given this
> shift in community usage, reducing the operational complexity of the overall
> IETF infrastructure seems to outweigh the very limited community served with
> FTP.
> >
> > In reviewing the additional impacts of such a service retirement, the
> dependencies on FTP have been assessed.  Additionally, it has been confirmed
> that all information currently reachable through FTP will continue to be
> available through other services (HTTP, RSYNC, IMAP).
> >
> > In consultation with the Tools team (Robert, Glen, Henrik, Russ, and Alexey),
> Communications team (Greg), affected SDO liaisons, IAB Chair, and LLC ED, a
> proposed retirement plan was developed and is available at:
> >
> > https://www.ietf.org/media/documents/Retiring_IETF_FTP_Service.pdf
> >
> > The IESG appreciates any input from the community on this proposal and will
> consider all input received by December 4, 2020 (to account for the upcoming
> IETF 109 and holidays).
> >
> > Regards,
> > Roman
> > (as the IESG Tools Liaison)
> 
> --
> ---
> tte@xxxxxxxxx




