Re: DHCID and the use of MD5 [Re: Last Call: 'Resolution of FQDN Conflicts among DHCP Clients' to Proposed Standard]

In message <Pine.LNX.4.64.0511261615210.26558@xxxxxxxxxx>, Pekka Savola writes:
>Hi,
>
>I'll break out the most substantial comments in separate messages..
>
>On Mon, 14 Nov 2005, The IESG wrote:
>> The IESG has received a request from the Dynamic Host Configuration WG to
>> consider the following documents:
>>
>> - 'A DNS RR for Encoding DHCP Information (DHCID RR) '
>>   <draft-ietf-dnsext-dhcid-rr-10.txt> as a Proposed Standard
>> - 'Resolution of FQDN Conflicts among DHCP Clients '
>>   <draft-ietf-dhc-ddns-resolution-10.txt> as a Proposed Standard
>> - 'The DHCP Client FQDN Option '
>>   <draft-ietf-dhc-fqdn-option-11.txt> as a Proposed Standard
>> - 'The DHCPv6 Client FQDN Option '
>>   <draft-ietf-dhc-dhcpv6-fqdn-03.txt> as a Proposed Standard
>
>I have only one major comment on DHCID on its use of MD5 as a 
>glued-in hash-function.  The rest of the comments are rather 
>straightforward.
>
>substantial
>----------
>
>    In order to avoid exposing potentially sensitive identifying
>    information, the data stored is the result of a one-way MD5 [5] hash
>    computation.  The hash includes information from the DHCP client's
>    REQUEST message as well as the domain name itself, so that the data
>    stored in the DHCID RR will be dependent on both the client
>    identification used in the DHCP protocol interaction and the domain
>    name.  This means that the DHCID RDATA will vary if a single client
>    is associated over time with more than one name.  This makes it
>    difficult to 'track' a client as it is associated with various domain
>    names.
>
>    The MD5 hash algorithm has been shown to be weaker than the SHA-1
>    algorithm; it could therefore be argued that SHA-1 is a better
>    choice.  However, SHA-1 is significantly slower than MD5.  A
>    successful attack of MD5's weakness does not reveal the original data
>    that was used to generate the signature, but rather provides a new
>    set of input data that will produce the same signature.  Because we
>    are using the MD5 hash to conceal the original data, the fact that an
>    attacker could produce a different plaintext resulting in the same
>    MD5 output is not significant concern.
>
>==> while the information exposure of someone cracking the MD5 hash 
>is not too huge, I believe it is unacceptable to design new protocols 
>without the capability to switch the hash function as need be.  This 
>could be achieved for example by reserving one additional byte from 
>the start of the DHCID record to designate the hash function used. 
>If you don't bother to define your own registry (for all of me, you 
>could include MD5 there as well, but at least include SHA1 and 
>preferably also SHA-256), you could possibly re-use 
>http://www.iana.org/assignments/ds-rr-types or something like that.
>
>That way, we can introduce new hash functions in a backward compatible 
>manner later on, with no need to revamp the protocol.
>
>If we don't do this, we'll need to define DHCID2, DHCID3, .. etc. 
>records further down in the future (w/ different hash functions) and 
>make DHCP co-exist with all of them.  That's bound to cause a lot of 
>protocol complexity, and I don't think we want to go there.

I agree with this comment.  The draft is wrong -- it asserts that a
"successful attack of MD5's weakness does not reveal the original data".
That's an unsupported assumption -- we have no idea what such an attack
would yield, since no such attack currently exists.
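
To make the suggested prefix byte concrete, here is a minimal sketch of
an algorithm-agile record layout: one leading byte names the hash
function and the digest follows.  The code points in DIGESTS, the
function names, and the simple "identifier followed by domain name"
hash input are assumptions made for illustration; none of them come
from the draft or from any IANA registry.

```python
import hashlib

# Hypothetical code points, for illustration only; a real registry
# would be assigned by IANA.
DIGESTS = {0: "md5", 1: "sha1", 2: "sha256"}


def encode_dhcid(alg: int, client_id: bytes, fqdn: bytes) -> bytes:
    """Return one algorithm byte followed by the digest of client_id + fqdn."""
    if alg not in DIGESTS:
        raise ValueError(f"unknown digest code {alg}")
    digest = hashlib.new(DIGESTS[alg], client_id + fqdn).digest()
    return bytes([alg]) + digest


def dhcid_matches(rdata: bytes, client_id: bytes, fqdn: bytes) -> bool:
    """Recompute the digest with whatever algorithm the record names."""
    return encode_dhcid(rdata[0], client_id, fqdn) == rdata
```

A verifier reads the first byte to decide which hash to recompute, so
a record written with a newer function can coexist with older ones
without defining a new RR type.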

More generally...  The currently-known attacks on MD5 are collision 
attacks: it's possible to generate two inputs that produce the same 
hash value.  This scenario requires a preimage attack; none are known.
It would not surprise me if someone were to develop one, but until that 
happens we can't speculate on its properties.  There are, however, some 
reasons for concern.  One of the options defined, the DHCPv4 Client 
Identifier, probably doesn't have much entropy.  For example, a 
suggestion in RFC 2132 says to use the ARP hardware type code and MAC 
address.  There's exactly one interesting hardware type code for most
users, and the high-order 3 bytes of the MAC address are the 
manufacturer's ID, not many of which are actually used.  Given that 
this is an 8-byte input string and that MD5 has a 16-byte output, it is 
plausible that comparatively few input strings hash to any given output.
If several of the input bytes are fixed, or at least constrained, there 
may be only one.  For that matter, that assumption alone may lead to a 
successful attack on MD5. 

In fact, the Security Considerations section should analyze the 
(non-trivial) probability of a brute-force attack.  Again, consider the 
Client Identifier, which is likely 8 bytes long.  2 are fixed, and 
hence irrelevant.  According to today's copy of
http://standards.ieee.org/regauth/oui/oui.txt there are 8786 
manufacturer IDs, or slightly more than 13 bits.  Effectively, though, 
it's less, since the usage is very non-uniform.  Even if it is uniform, 
though, that field plus the unit identifier only total slightly over 37 
bits -- well within anyone's capabilities.
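
The size of that search space, and the shape of the search itself, can
be sketched in a few lines.  search_space_bits() just evaluates
log2(8786) + 24.  recover_mac() is a toy exhaustive search; its name,
its parameters, and the simplified hash input (one hardware type byte,
the MAC address, then the domain name) are assumptions for
illustration and do not reproduce the draft's exact encoding.

```python
import hashlib
import math

OUI_COUNT = 8786  # manufacturer IDs in the registry, as counted above
UNIT_BITS = 24    # low-order 3 bytes of the MAC address


def search_space_bits(oui_count: int = OUI_COUNT, unit_bits: int = UNIT_BITS) -> float:
    """Bits of uncertainty if manufacturer IDs and unit IDs are uniform."""
    return math.log2(oui_count) + unit_bits


def recover_mac(target: bytes, fqdn: bytes, ouis, unit_limit: int, htype: int = 1):
    """Hash every candidate identifier until one reproduces target.

    unit_limit caps the unit identifiers tried per manufacturer ID so the
    demonstration stays small; a full search would use 2**24.
    """
    for oui in ouis:
        for unit in range(unit_limit):
            mac = oui + unit.to_bytes(3, "big")
            client_id = bytes([htype]) + mac  # assumed input layout
            if hashlib.md5(client_id + fqdn).digest() == target:
                return mac
    return None
```

Because the domain name is public and the rest of the input is this
constrained, the one-way property of the hash does little to hide the
identifier from anyone willing to enumerate the candidates.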

Most of this analysis applies to the other two options as well.

		--Steven M. Bellovin, http://www.cs.columbia.edu/~smb



_______________________________________________

Ietf@xxxxxxxx
https://www1.ietf.org/mailman/listinfo/ietf
