Re: [Last-Call] Opsdir last call review of draft-ietf-quic-manageability-14

Lucas Pardue <lucaspardue.24.7@xxxxxxxxx> · Mon, 7 Feb 2022 16:24:51 +0000

Hey,

On the topic of fallback or failover or something else. My 2c with no hats.

On Mon, 7 Feb 2022, 15:36 Spencer Dawkins at IETF, <spencerdawkins.ietf@xxxxxxxxx> wrote:
I haven't seen anyone else respond to Al's email on this point, so I thought I'd share an opinion. 

On Sat, Feb 5, 2022 at 4:12 PM Al Morton via Datatracker <noreply@xxxxxxxx> wrote:
Reviewer: Al Morton

Review result: Has Issues

Hi Mirja and Brian,

This is the OPSDIR review of

              Manageability of the QUIC Transport Protocol

                    draft-ietf-quic-manageability-14

Snip, down to 

4.6.  UDP Blocking, Throttling, and NAT Binding

...

   Further, if UDP traffic is desired to be throttled, it is recommended

   to block individual QUIC flows entirely rather than dropping packets

   indiscriminately.  When the handshake is blocked, QUIC-capable

   applications may fail over to TCP.  However, blocking a random

[acm]

is "fail over" or "fallback" the preferred term?

(using only one will help)

   fraction of QUIC packets across 4-tuples will allow many QUIC

   handshakes to complete, preventing a TCP failover, but these

[acm] ... or "failover" preferred?

   connections will suffer from severe packet loss (see also

   Section 4.5).  Therefore, UDP throttling should be realized by per-

   flow policing, as opposed to per-packet policing.  Note that this

   per-flow policing should be stateless to avoid problems with stateful

   treatment of QUIC flows (see Section 4.2), for example blocking a

   portion of the space of values of a hash function over the addresses

   and ports in the UDP datagram.  While QUIC endpoints are often able

   to survive address changes, e.g. by NAT rebindings, blocking a

   portion of the traffic based on 5-tuple hashing increases the risk of

   black-holing an active connection when the address changes.
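
For readers who want a picture of the stateless per-flow policing the draft text above describes, here is a minimal Python sketch. It assumes SHA-256 as the hash and a 32-bit hash space; the names flow_hash, should_block and blocked_fraction are hypothetical and do not come from the draft.

```python
import hashlib
import ipaddress


def flow_hash(src_ip, src_port, dst_ip, dst_port):
    """Map the addresses and ports of a UDP datagram to a 32-bit value.

    Assumption: SHA-256 truncated to 32 bits; the draft does not name a hash.
    """
    key = b"".join([
        ipaddress.ip_address(src_ip).packed,
        src_port.to_bytes(2, "big"),
        ipaddress.ip_address(dst_ip).packed,
        dst_port.to_bytes(2, "big"),
    ])
    return int.from_bytes(hashlib.sha256(key).digest()[:4], "big")


def should_block(src_ip, src_port, dst_ip, dst_port, blocked_fraction):
    """Block a fixed portion of the hash space, with no per-flow state.

    Every packet of a given flow hashes to the same value, so a flow is
    either blocked entirely or passed entirely.
    """
    threshold = int(blocked_fraction * 2**32)
    return flow_hash(src_ip, src_port, dst_ip, dst_port) < threshold
```

Because the decision depends only on the addresses and ports, a NAT rebinding that changes the source port can move an established connection from the passed portion of the hash space to the blocked one, which is the black-holing risk the draft mentions.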

In my mind, 
"fallback" makes more sense if we are talking about falling back within a single protocol (for example, attempting to use an extension, discovering that the other host doesn't support that extension, and retrying without the extension - or, also within a single protocol, attempting to use version 9, discovering that the other host doesn't support that version, and retrying with a different version), and 
"failover" makes more sense if we are talking about starting with one protocol (QUIC, in this case) and if that doesn't work, switching to a different protocol (TCP, in this case). 
I know we've used both terms somewhat interchangeably during discussions about QUIC (and not just discussions about this document), but if one term is to be chosen (Al's suggestion, which I agree with), I think what we're talking about here is "failover". 

Other people may have thoughts, of course. 

To me this is really an implementation detail. 

In HTTP/3's case you might learn of it via Alt-Svc over TCP and try to switch to it. That could fail and you might "fall back", or really just ignore a suggestion that turned out to be bogus. Nothing fell over or failed except an opportunistic connection attempt.

Now add in the alternative discovery mechanism of SVCB and HTTPS records. Clients would learn the availability of multiple HTTP versions and are likely to implement different strategies for selecting one or the other. Imagine a client that learns that some kinds of network block QUIC. Maybe it would just stop bothering to try QUIC on that network despite any other information available.
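
As a purely illustrative sketch of one such strategy (the class, method names and the network_id notion are hypothetical, not taken from any real client), a client might remember the networks where QUIC attempts failed and skip QUIC there even when "h3" is advertised:

```python
class TransportChooser:
    """Hypothetical client policy: remember networks where QUIC failed."""

    def __init__(self):
        # Identifiers of networks where a QUIC attempt has failed before.
        self._quic_blocked_networks = set()

    def record_quic_failure(self, network_id):
        self._quic_blocked_networks.add(network_id)

    def choose(self, network_id, advertised_alpns):
        """Return "quic" or "tcp" for the next connection attempt.

        advertised_alpns: ALPN identifiers learned via Alt-Svc or
        SVCB/HTTPS records, e.g. ["h3", "h2"].
        """
        offers_h3 = "h3" in advertised_alpns
        if offers_h3 and network_id not in self._quic_blocked_networks:
            return "quic"
        return "tcp"
```

An observer on the path sees only the resulting TCP connection, not the remembered failure that led to it, and another client could apply an entirely different policy.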

There are, of course, other application protocols that run over QUIC. Those things might have no TCP equivalent. The network operators should not be trying to introspect such things as application mapping.

All of these things are examples where client agency gives no hope for others to guess their behaviour. And that's fine. IMO it suffices to say that a client can switch to any transport, at any time, for any reason. That is neither failover nor fallback, just choice.

Trying to diagnose network issues based on choice is futile and, if done naively, could falsely diagnose problems that do not exist.

Cheers
Lucas

Best,

Spencer

-- 
last-call mailing list
last-call@xxxxxxxx
https://www.ietf.org/mailman/listinfo/last-call