Re: Quic: the elephant in the room

"Andrew McConachie" <andrew@xxxxxxxxx> · Mon, 12 Apr 2021 12:36:43 +0200

On 11 Apr 2021, at 20:13, Michael Thomas wrote:

On 4/11/21 10:23 AM, Salz, Rich wrote:

  * I don't see why [DNS timeouts] it can't be long lived, but even
    normal TTL's would get amortized over a lot of connections. Right
    now with certs it is a 5 message affair which cannot get better.
    But that is why one of $BROWSERVENDORS doing an experiment would
    be helpful.

There are use-cases where a five-second DNS TTL is important.  And 
they’re not amortized over multiple connections from **one** user, 
but rather affect **many** users.  Imagine an e-commerce site 
connected to two CDNs that needs to switch between them.

The worst case is that it devolves into what we already have: 5 
messages assuming NS records are cached normally.

Another approach using current infrastructure would be for the client 
to cache the certs and hand the server cert the fingerprint(s) in the 
ClientHello and the server sends down the chosen cert's fingerprint 
instead of the cert which could get it back to 3 messages too. That 
would require hacking on TLS though (assuming that somebody hasn't 
already thought of this). That has the upside that the server 
chooses whether it wants to use the cached version or not.

For the past 2 years or so I’ve been performing HTTPS DANE validation 
on all TLS 1.1 and 1.2 connections egressing my house. I use custom code 
written for OpenWRT that installs ACLs to block TLS connections that 
fail DANE validation.
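
To make "DANE validation" concrete, here is a minimal sketch of the 
core comparison. It is not the code running on the OpenWRT box. It 
assumes a TLSA record with selector 0 (full certificate) and the 
matching types defined in RFC 6698; the function name and parameters 
are made up for illustration.

```python
import hashlib


def tlsa_matches(cert_der: bytes, matching_type: int, assoc_data: bytes) -> bool:
    """Check a DER-encoded certificate against TLSA association data.

    Hypothetical helper: handles selector 0 (full certificate) only.
    matching_type follows RFC 6698: 0 = exact, 1 = SHA-256, 2 = SHA-512.
    """
    if matching_type == 0:
        candidate = cert_der
    elif matching_type == 1:
        candidate = hashlib.sha256(cert_der).digest()
    elif matching_type == 2:
        candidate = hashlib.sha512(cert_der).digest()
    else:
        # Unknown matching type: treat the record as unusable.
        return False
    return candidate == assoc_data
```

A real validator would also need DNSSEC-validated answers, selector 1 
(SubjectPublicKeyInfo) support, and the certificate usage rules, none 
of which are shown here.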

More info here.
<https://www.middlebox-dane.org/>

The short story is that it works pretty well. I usually forget I have it 
running, and on the off chance I hit a website that gets blocked because 
of invalid TLSA RRs, it usually takes me a few minutes to figure out why 
the website isn’t loading.

One thing I’ve learned from this experiment is that, at least in my 
situation, DNS is always faster than any TLS setup time that matters. 
The web is horrendously slow, so even if TLS setup completes, my OpenWRT 
box will kill the TLS connection long before any real data gets 
transferred. These things happen asynchronously, and I have Unbound 
running directly on the OpenWRT box, so DNS always wins. But even if the 
TLS handshake were to complete fully and only then get killed by OpenWRT 
it would be the same outcome.

When looking at how one might implement DANE for HTTPS/TLS I don’t see 
any reason to handle these things sequentially. You don’t have to 
change TLS; you just have to do things asynchronously. Query for TLSA RRs 
at the same time as sending the TLS ClientHello, and kill the connection 
setup when/if DANE validation fails. On the off chance that the DNS 
actually takes longer than TLS, maybe delay sending data via TLS until 
DNS responds. But I bet this almost never happens.
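
The parallel approach described above could be sketched roughly like 
this with asyncio. The `handshake`, `lookup_tlsa`, and `validate` 
arguments are hypothetical placeholders supplied by the caller, not 
part of any real TLS or DNS library.

```python
import asyncio


async def connect_with_dane(handshake, lookup_tlsa, validate):
    """Run the TLS handshake and the TLSA lookup concurrently.

    handshake, lookup_tlsa: hypothetical zero-argument coroutine functions.
    validate: hypothetical callable(conn, tlsa_records) -> bool.
    """
    # Start both at once; neither waits for the other to begin.
    hs_task = asyncio.create_task(handshake())
    dns_task = asyncio.create_task(lookup_tlsa())

    # Hold back application data until both have finished. If DNS is
    # the faster of the two, this adds no delay beyond the handshake.
    conn, records = await asyncio.gather(hs_task, dns_task)

    if not validate(conn, records):
        # Kill the connection before any real data is exchanged.
        conn.close()
        raise ConnectionError("DANE validation failed")
    return conn
```

The sketch waits for DNS before returning the connection, which is the 
conservative variant: no data flows over TLS until validation has 
passed.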

If you really don’t want any slowdown from DANE, just set up TLS 
normally and only kill it once DANE validation fails. The web is so slow 
that the chances something significant will happen prior to the client 
fetching the TLSA RR and performing DANE validation are practically nil.

Thanks,
Andrew