Re: [Last-Call] [dns-privacy] Review of draft-ietf-dprive-rfc7626-bis-03

"Martin Thomson" <mt@xxxxxxxxxxxxxx> · Thu, 02 Jan 2020 12:03:38 +1100

On Thu, Dec 19, 2019, at 02:06, Sara Dickinson wrote:
> To try to separate out the issue with the text in Section 3.5.1.1 I’ll 
> respond to the comments on that in a separate thread and try to address 
> the other issues in this email. 

Ack.  Ekr's answer will suffice for mine there.

> > BTW, what is "HTTPS destination IP address fingerprinting"? Was the intent of this paragraph to say that this document only examines the DNS protocol independent of the greater context in which it is used? That is, it looks at DNS without considering how privacy risks might result from the use of DNS in combination with other protocols?
> 
> It is describing the ability to fingerprint the website a user connects 
> to based just on the IP address of the HTTPS traffic. For example, this 
> paper given at ANRW https://dl.acm.org/authorize?N687437. Please 
> suggest text if you prefer a different description of this issue.

Given that I don't know what the intent of your statement is, I can't really suggest anything.  However, if the intent is to refer to the paper, then do so.  You are using what appears to be a term or proper name with the expectation that it is understood, but that clearly isn't the case.

> Suggest:
> 
> OLD:
> “The use of clear text transport options to decrease latency may also 
> identify a user e.g. using TCP Fast Open [RFC7413]."
> 
> NEW:
> “Note that even when using encrypted transports the use of clear text 
> transport options to decrease latency can provide correlation of a 
> users' connections e.g. using TCP Fast Open [RFC7413].”

I don't think that really addresses the central point, namely that these options trade linkability for performance and need to do so based on the client's policies with respect to linkability.  TFO is something of a bad example because it offers no prospects for confidentiality protection and it seems like it is not likely to be widely deployed (for that privacy reason and several others related to deployment challenges).

> NEW:
> “Implementations that support encrypted transports are also highly 
> likely to re-use sessions for multiple DNS queries to optimize 
> performance (e.g. via DNS pipelining or HTTPS multiplexing). Default 
> configuration options for encrypted transports could in principle 
> fingerprint a specific client application. For example:…
> 
> If libraries or applications offer user configuration of such options 
> (e.g. [stubby]) then they could in principle help to identify a 
> specific user.”

If you want the most superficial treatment of the issue, sure.  A proper treatment of the issue would require a more dramatic change.  For instance:

   These are cases where user identification, fingerprinting or
   correlations may be possible due to the use of certain transport
   layers or clear text/observable features.  These issues are not
   specific to DNS, but DNS traffic is susceptible to these attacks when
   using specific transports.

Could become:

   The manner in which endpoints implement different protocols might offer the ability for a network-based observer to correlate activity from the same implementation.  This is not regarded as a serious problem, as the resulting anonymity set grows with the number of deployments of the same stack.  Furthermore, DNS usages might then share an anonymity set with other protocols.

   However, individualized customization of stack operation could enable fingerprinting.  Implementations that offer the ability to alter default options for the operation of the unprotected parts of a stack risk creating smaller anonymity sets.
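
To make the anonymity-set argument concrete, here is a minimal sketch in Python.  It is illustrative only: the client labels, the feature tuple (stack name, padding block size, session resumption), and the function name anonymity_set_sizes are all hypothetical and come from neither the draft nor any implementation.  It assumes an observer can see exactly those features and nothing else.

```python
from collections import Counter


def anonymity_set_sizes(observed):
    """Map each client to the number of clients sharing its observable fingerprint."""
    counts = Counter(observed.values())
    return {client: counts[fingerprint] for client, fingerprint in observed.items()}


# Hypothetical observable features per client:
# (stack name, padding block size, session resumption enabled)
clients = {
    "a": ("stack-x", 128, True),   # default configuration
    "b": ("stack-x", 128, True),   # default configuration
    "c": ("stack-x", 128, True),   # default configuration
    "d": ("stack-x", 468, False),  # user-customized options
}

sizes = anonymity_set_sizes(clients)
for client, size in sorted(sizes.items()):
    print(client, size)
```

Under these assumptions the three clients running defaults share one anonymity set, while the customized client is alone in its own, which is the risk the second paragraph describes.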

I'll point out that all the discussion of altering HTTP behaviour for DoH, such as cookies, makes it less likely that DoH will look like something else.  If endpoints simply had a framework for understanding when linkability is and isn't tolerable, then this would be far less problematic.  I'll note that browsers already have this framework. That's why you don't see browser people in the group of concerned citizens.  We all know that the framework is terrible in certain ways, but we understand its limitations and they are an area of active work.

> > Section 3.5.1.2
> RFC7626 included Section 2.5.3 
> https://tools.ietf.org/html/rfc7626#section-2.5.3 ‘Rogue Servers’. This 
> section is just an update of that text to improve context and remove 
> the phrase ’rogue server’. Since the majority of OS implementations 
> still use these mechanisms today it seems to still be relevant. 

Oh, I see that now.  Yeah, Section 2.5.3 of RFC 7626 makes one very important point: it acknowledges the possibility for a server to abuse its privileged position.  I failed to link this to that text because this section only contains the other bits.

Those other bits are just a re-iteration of the failings of various discovery methods.  I think you will find that these limitations are acknowledged in the specifications for those protocols.  They aren't in the threat model for those protocols.

The point is that we're trusting the network to provide this service when we have no strong basis for trusting that the network is able to do so.

So maybe the answer here is to say:

  Stub resolvers that discover a resolver identity using the network are trusting the network to both operate a recursive resolver and to secure the discovery process.  That is, in addition to exposing queries to the network operator, vulnerabilities in the discovery process might allow an attacker to interpose their own resolver.  Note that DHCP assumes that the network provides certain safeguards; see Section 22 of [RFC8415].  [[include examples here]]

  Failing to authenticate and authorize a recursive resolver also exposes stub resolvers to the possibility of attack; see Section 3.5.1.4. Automatic discovery without any prior expectations about the identity of allowed resolvers makes authorization impossible.

(inline text on authorization as you suggested.)

I would point out that the citation for dnschanger violates the standard assumptions in RFC 3514, so I wouldn't rely on that so much.  The ARP/NDP examples are in direct contradiction to the additional assumptions that DHCP makes.

> > Section 3.5.1.3
> NEW:
> “ Many network operators argue that they block access to remote 
> resolvers for security reasons, for example to cripple malware and bots 
> or to prevent data exfiltration methods that use encrypted DNS 
> communications as transport. Further discussion of Internet service 
> blocking and filtering can be found in [RFC7754]."

How about avoiding "argue that" and the associated rationale (your choice of words here has the unfortunate effect of promulgating an argument about efficacy that is almost certainly true today, but still steps into the debate) and instead stick to facts:

   As a matter of policy, some recursive resolvers use their position in the query path to selectively block access to certain records.  This is a form of Rendezvous-Based Blocking as described in Section 4.3 of [RFC7754].  In order to prevent circumvention of their blocking policies, some networks also block access to resolvers with incompatible policies.

FWIW, the title of this section might be problematic.  This isn't "user-selected", especially in the malware case.

> > Section 3.5.1.5.1
> > 
> > The arguments here repeat those from Section 3.4.2 (nit: not 3.4 as stated). A section reference would be enough.
> 
> I’ll update the section reference and remove the last sentence and the 
> two bullet points. 
> 
> > 
> > Section 3.5.1.5.2
> NEW:
> “Users should be aware that the particular choice of HTTPS 
> functionality vs data minimisation (for example, whether to include the 
> user-agent header) is an implementation specific choice in DoH, not one 
> defined in RFC8484.”

Who is the audience for this document again?

That doesn't really address my more general concerns.  For instance, I might take offense at:

>  the wide practice in HTTP to use various headers to optimize HTTP connections, functionality and behaviour (which can facilitate user identification and tracking)

on the basis that it assumes that these optimizations are deployed without regard to privacy.
