On Oct 9, 2008, at 10:48 AM, Nicholas Weaver wrote:
On Oct 9, 2008, at 9:52 AM, Ólafur Guðmundsson /DNSEXT chair wrote:
At 19:17 02/10/2008, Nicholas Weaver wrote:
I believe this draft is insufficient:
4.1: Frankly speaking, with all the mechanisms out there, you
must assume that an attacker can force queries of the attacker's
choosing at times of the attacker's choosing, within a fraction of
a second in almost all cases. This is not by directly generating
a request to the server, but by performing some application
operation (e.g., a mail request where the SMTP resolver checks it,
Javascript loaded in a victim's browser, etc.). Preventing external
recursive queries is about as effective a defense as a chalk line
on the ground saying "do not cross".
While this is true in general, there are a number of situations
where not allowing "outside" queries helps.
For example:
- Sites that run separate DNS recursive resolvers/caches for mail
servers from those used by the user population in general.
- Sites that are able to keep malware off internal hosts limit the
ability of attackers to issue queries from the inside.
Sites that don't allow their users to visit ANY external web
sites... An attacker who can get a user to visit a web site of the
attacker's choosing can cause the DNS resolvers to issue an
arbitrary stream of requests, EVEN without javascript, just iframes
and redirections.
When developing a defensive posture for a DNS resolver, you MUST (in
the strict standards sense, capitalization is deliberate!) assume the
attacker can generate an arbitrary request stream: there are simply
too many vectors through which an attacker can do so to consider it
plausible to shut them all down.
Although it may be a good idea for other reasons to prevent external
use of your institution's recursive resolver, the net security gain
is negligible.
Agreed. An MUA or MTA might be used to autonomously generate DNS
queries to both evil and victim name servers. There are MUA plugins
that will also attempt to resolve email path registration records with
a sizable series of DNS transactions for A, AAAA, or MX records. The
script-like resolution process can be directed by a cached resource
record, where subsequent transactions can be modulated by a macro
employing the local-part of the originator's email address (MAIL-FROM
or PRA). An attacker can control a sequence of 100 transactions
directed at both evil and victim name servers, where neither is
directly referenced by the message. The sequencing process could
be instantiated by a routine that is bound to a UDP port, or that
navigates through a NAT where port diversity might be greatly
reduced. Both the poisoning timing and the opportunity would be
afforded by spam while consuming little of the attacker's resources
beyond the additional poison. This could poison the cache used by
the user population as well as the cache used by the MTA. This
approach offers the attacker two advantages: ample opportunity while
also overwhelming the authoritative name server. :(
Section 4 also excludes two significant additional entropy sources
which can't always be used but often can: 0x20 matching and
duplication. ~32b of entropy is only marginally sufficient, but
40b+ (achievable through 0x20 and/or duplication) is sufficient for
protecting the cache.
After some discussion in the WG this was explicitly ruled outside
the scope of the document. For the record, most of the attacks I'm
seeing are trying to spoof the root and TLDs using "mostly numeric"
domain names; in these cases 0x20 provides limited defense.
However, those attacks are only able to work because of the race-
until-win properties.
The TTLs for these records are relatively long, thus without race
until win, the attack window is very narrow.
Where 0x20 provides protection is on names with short/0 TTL, which
includes many rather important sites (googlemail.l.google.com,
login.yahoo.akadns.net) which also happen to have rather long names.
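For concreteness, a minimal sketch of the 0x20 mechanism under
discussion (the function names are mine, and real resolvers apply
this at the wire-format level rather than on strings):

```python
import random

def encode_0x20(qname, rng):
    """Randomize the case of every ASCII letter in a query name.

    Each letter contributes ~1 bit of entropy: DNS matching is
    case-insensitive, but a well-behaved authoritative server echoes
    the query name back with the case bits intact, so a blind spoofer
    must guess them.
    """
    return "".join(
        c.upper() if c.isalpha() and rng.random() < 0.5 else c.lower()
        for c in qname
    )

def accept_0x20(sent, received):
    """Accept a response only if the echoed name matches case-exactly."""
    return sent == received

# Long, letter-heavy names gain many case bits; "mostly numeric"
# names (as in the root/TLD attacks mentioned above) gain few.
case_bits = sum(1 for c in "login.yahoo.akadns.net" if c.isalpha())
```

Here `case_bits` is 19, which together with a 16-bit query ID and
some port entropy is what pushes such long names past the 40b+
range discussed above.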
DKIM provides an interesting feature. It uses arbitrary sub-domain
names to reference public keys. Once an attacker detects success
through any of a number of techniques, poisoned keys can then
validate signed email sent to any recipient using the same resolver.
More important I think is the lack of discussing the semantics of
duplication. Duplication is a really powerful tool for increasing
entropy if you believe 32b of entropy is insufficient and there
isn't enough additional entropy obtained by 0x20.
There are too many situations, such as the one described, where it
would be wrong to assume there is 32b of entropy, and where a few
case bits would become fairly significant. Detecting a possible
attack on a specific resource (or its hash) and checking for two
matching answers seems like a better long-term solution.
Additionally, since you CAN detect that you are under a generic
attack if you have at least 32b of entropy (just count unsolicited
responses), you can respond with duplication to effectively double
your entropy to 64b during the period in which you are under attack.
The attack detection would need to resolve against each resource
separately, or this could double the effect of a path registration
DDoS attack.
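The entropy arithmetic behind duplication can be sketched as follows
(a simple model of mine, assuming each duplicated outstanding query
carries independent entropy and an accepted answer must match every
copy):

```python
import math

def spoof_success_probability(forged_packets, entropy_bits, copies=1):
    """P(at least one forged response is accepted) in a single race.

    With `copies` independent outstanding duplicates of the query,
    the per-packet success probability drops from 2^-entropy to
    2^-(entropy * copies).
    """
    p_one = 2.0 ** -(entropy_bits * copies)
    # 1 - (1 - p)^n, computed stably for tiny p
    return -math.expm1(forged_packets * math.log1p(-p_one))

# With 32b of entropy, ~2^32 forged packets give the attacker roughly
# even odds in one race; doubling to two copies (effective 64b) makes
# the same packet budget hopeless.
p_single = spoof_success_probability(2 ** 32, 32, copies=1)
p_double = spoof_success_probability(2 ** 32, 32, copies=2)
```

This is also why switching duplication on only while the
unsolicited-response counter is elevated costs little: the doubled
query load is paid only during an active attack.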
Thus if the document EXCLUDES 0x20 and duplication, it should
explicitly state something to the effect: There are additional
mechanisms which can be used to increase query entropy in most (but
not necessarily all) cases, such as 0x20 randomization [cite] and
packet duplication. They are deliberately excluded from this
document for reasons X,Y,Z.
Agreed, with the proviso that the detection technique limits
duplication to that of the attacked resource.
Likewise, sections 7-8 explicitly ignore the effects of "race until
win". As long as a resolver will accept any additional data from
a result into the cache, even when in scope (section 6's
recommendations enable race-until-win attacks), the TTL becomes 0
regardless of the actual TTL of the record. This is the real
power of the Kaminsky attack: it is not a reduction in packet-work
for the attacker, but a reduction in time.
This is also in the 'data admission' category, not the 'packet
acceptance' category, but it supports one of my favorite sayings:
"Optimization is the root cause of most problems". In this case
people were hoping to decrease the number of queries by learning
from a side effect.
Then there needs to be an explicit reference of the sort:
These equations depend not only on the packet acceptance policy, but
also the data admission policy. Unless data admission policies are
changed to be significantly more restrictive than the current
standard specifies, race-until-win attacks are possible where an
attacker can keep attempting to poison a target name regardless of
TTL. As long as race-until-win attacks are possible, one MUST
assume that TTL = W for all defensive calculations.
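The TTL = W assumption can be sketched numerically (my own toy model:
since the attacker can force a fresh race at will, each window of
length W is an independent trial and the record's real TTL never
enters the calculation):

```python
import math

def expected_time_to_poison(entropy_bits, forged_per_window, window_seconds):
    """Expected seconds until a race-until-win attacker succeeds.

    Each window is a Bernoulli trial with success probability
    p = 1 - (1 - 2^-entropy)^forged, so the expected number of
    windows is 1/p and the expected time is W/p.
    """
    p_win = -math.expm1(
        forged_per_window * math.log1p(-(2.0 ** -entropy_bits))
    )
    return window_seconds / p_win

# At 16 bits of entropy (query ID only, fixed port) and 100 forged
# packets per 0.1s window, poisoning is expected within about a
# minute; at 32 bits the same attacker needs on the order of weeks.
t_16 = expected_time_to_poison(16, 100, 0.1)
t_32 = expected_time_to_poison(32, 100, 0.1)
```

The point of the proposed text above is that these times hold for
every record, including long-TTL root/TLD data, unless data
admission is tightened.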
There is also an assumption that there is duplicate-outstanding-
request suppression (no birthday attacks), which I don't believe is
explicitly stated.
Agreed.
-Doug
_______________________________________________
Ietf@xxxxxxxx
https://www.ietf.org/mailman/listinfo/ietf