RE: Best practice for data encoding?

Jeffrey Hutzelman <jhutz@xxxxxxx> · Tue, 06 Jun 2006 14:17:15 -0400

On Tuesday, June 06, 2006 10:33:30 AM -0700 Christian Huitema 
<huitema@xxxxxxxxxxxxxxxxxxxxx> wrote:

ASN.1 implementation bugs have also caused security problems for SSL,
Kerberos, ISAKMP, and probably others. These bugs are also not due to
shared code history: they turn up again and again.

Are there any other binary protocols that can be usefully compared with
ASN.1's security history?

There is indeed a lot of complexity in ASN.1. At the root, ASN.1 is a
basic T-L-V encoding format, similar to what we see in multiple IETF
protocols. However, for various reasons, ASN.1 includes a number of
encoding choices, each of which is an occasion for programming errors:

To be pedantic, ASN.1 is what its name says it is - a notation.
The properties you go on to describe are those of BER; other encodings have 
other properties.  For example, DER adds constraints such that there are no 
longer multiple ways to encode the same thing.  Besides simplifying 
implementations, this also makes it possible to compare cryptographic 
hashes of DER-encoded data; X.509 and Kerberos both take advantage of this 
property.  PER eliminates many of the tags and lengths, and my 
understanding is that there is also a set of rules (XER) for encoding 
ASN.1 data in XML.
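
As a rough illustration of the DER point, here is a minimal Python sketch.
It is not taken from any particular implementation, and the helper names
(der_length, ber_padded_length, octet_string) are made up for this example.
It encodes the same two-octet OCTET STRING twice: once with the minimal
length form that DER requires, and once with a padded long-form length that
BER also accepts. The two byte strings differ, so their hashes differ too.

```python
import hashlib

def der_length(n: int) -> bytes:
    """Length octets in the one minimal form that DER permits."""
    if n < 0:
        raise ValueError("length must be non-negative")
    if n < 0x80:
        return bytes([n])                      # short form
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body    # shortest long form

def ber_padded_length(n: int, width: int) -> bytes:
    """Long-form length using `width` octets; legal BER even when padded."""
    if not 1 <= width <= 126:
        raise ValueError("width must be 1..126")
    return bytes([0x80 | width]) + n.to_bytes(width, "big")

def octet_string(payload: bytes, length_octets: bytes) -> bytes:
    """OCTET STRING (universal tag 0x04) with caller-chosen length octets."""
    return b"\x04" + length_octets + payload

payload = b"hi"
der = octet_string(payload, der_length(len(payload)))
ber_alt = octet_string(payload, ber_padded_length(len(payload), 2))

print(der.hex())      # 04026869
print(ber_alt.hex())  # 048200026869
print(hashlib.sha256(der).digest() == hashlib.sha256(ber_alt).digest())  # False
```

A BER decoder should treat both encodings as the same value; only the first
is valid DER, which is what makes comparing hashes of DER data meaningful.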

* One can argue that SNMP makes a creative use of the "Object
Identifier" data type of ASN.1, but one also has to wonder why this data
type is specified in the language in the first place.

Well, I can't speak to the original motivation, but under BER, encoding the 
same sort of hierarchical name as a SEQUENCE OF INTEGER takes about three 
times the space the primitive type does, assuming most of the values are 
small.
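
The size comparison can be checked with a short Python sketch. The encoder
below is a simplified one written for this example (non-negative values
only, hypothetical function names), and the name used, 1.3.6.1.2.1.1.1.0,
is just a familiar SNMP OID picked as sample input.

```python
def length_octets(n: int) -> bytes:
    """Minimal definite-length octets."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body

def tlv(tag: int, content: bytes) -> bytes:
    """Tag, length, value with a single-octet tag."""
    return bytes([tag]) + length_octets(len(content)) + content

def base128(n: int) -> bytes:
    """Base-128 digits, high bit set on every octet except the last."""
    out = [n & 0x7F]
    n >>= 7
    while n:
        out.append(0x80 | (n & 0x7F))
        n >>= 7
    return bytes(reversed(out))

def encode_oid(arcs) -> bytes:
    """OBJECT IDENTIFIER: the first two arcs share one subidentifier."""
    if len(arcs) < 2:
        raise ValueError("an OID needs at least two arcs")
    content = base128(40 * arcs[0] + arcs[1])
    content += b"".join(base128(a) for a in arcs[2:])
    return tlv(0x06, content)

def encode_integer(n: int) -> bytes:
    """INTEGER, non-negative values only in this sketch."""
    if n < 0:
        raise ValueError("negative values not handled here")
    size = n.bit_length() // 8 + 1   # leaves room for a zero sign bit
    return tlv(0x02, n.to_bytes(size, "big"))

def encode_sequence_of_integer(arcs) -> bytes:
    """SEQUENCE OF INTEGER carrying the same arcs."""
    return tlv(0x30, b"".join(encode_integer(a) for a in arcs))

arcs = (1, 3, 6, 1, 2, 1, 1, 1, 0)
oid = encode_oid(arcs)
seq = encode_sequence_of_integer(arcs)
print(len(oid), len(seq))  # 10 29
```

Each small arc costs one octet inside the OID but three (tag, length, value)
as a separate INTEGER, which is where the factor of roughly three comes from.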

Then there are MACRO definitions, VALUE specifications, and an even more
complex definition of extension capabilities. In short, ASN.1 is vastly
more complex than the average TLV encoding. The higher rate of errors is
thus not entirely surprising.

There certainly is a rich set of features (read: complexity) in both the 
ASN.1 syntax and its commonly-used encodings.  However, I don't think 
that's the real source of the problem.  There seem to be a lot of ad-hoc 
ASN.1 decoders out there that people have written as part of some other 
protocol, instead of using an off-the-shelf compiler/encoder/decoder; this 
duplication of effort and code is bound to lead to errors, especially when 
it is done with insufficient attention to the details of what is indeed a 
fairly complex encoding.

I also suspect that a number of the problems found have nothing to do with 
decoding ASN.1 specifically, and would have come up had other approaches 
been used.  For example, several of the problems cited earlier were buffer 
overflows found in code written well before the true impact of that problem 
was well understood.  These problems are more likely to be noticed and/or 
create vulnerabilities when they occur in things like ASN.1 decoders, or 
XDR decoders, or XML parsers, because that code tends to deal directly with 
untrusted input.
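
To make the untrusted-input point concrete, here is a hedged Python sketch
of a TLV reader that checks every claimed length against the bytes actually
present before using it. The names (DecodeError, read_tlv) and the limits
chosen (single-octet tags, at most four length octets, no indefinite
lengths) are assumptions of this example, not requirements of BER. In C,
omitting the same checks is what turns a malformed length into a buffer
overflow; the checks apply equally to any length-prefixed format.

```python
class DecodeError(ValueError):
    """Raised when the input is malformed or truncated."""

def read_tlv(buf: bytes, pos: int = 0):
    """Read one definite-length TLV; return (tag, value, next_pos)."""
    if pos < 0 or pos + 2 > len(buf):
        raise DecodeError("truncated header")
    tag = buf[pos]
    if tag & 0x1F == 0x1F:
        raise DecodeError("multi-octet tags not supported in this sketch")
    first = buf[pos + 1]
    pos += 2
    if first < 0x80:
        length = first                      # short form
    else:
        count = first & 0x7F                # number of length octets
        if count == 0:
            raise DecodeError("indefinite length not supported")
        if count > 4:
            raise DecodeError("length field too large")
        if pos + count > len(buf):
            raise DecodeError("truncated length")
        length = int.from_bytes(buf[pos:pos + count], "big")
        pos += count
    end = pos + length
    if end > len(buf):
        # The sender claims more data than it supplied: reject, never read on.
        raise DecodeError("length exceeds remaining input")
    return tag, buf[pos:end], end

print(read_tlv(bytes.fromhex("04026869")))  # (4, b'hi', 4)
```

A claimed length larger than the remaining input, as in the octets 04 05 68
69, raises DecodeError instead of being trusted.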

-- Jeffrey T. Hutzelman (N3NHS) <jhutz+@xxxxxxx>
  Sr. Research Systems Programmer
  School of Computer Science - Research Computing Facility
  Carnegie Mellon University - Pittsburgh, PA
