On vulnerabilities in open and closed source products

"Steven M. Christey" <coley@linus.mitre.org> · Tue, 26 Nov 2002 19:56:12 -0500 (EST)

Dave Aitel said:

>on Open Source platforms (or platforms for which the source code is so
>readily available as to make it open source in all but name) people
>are now hunting down obscure integer overflows, and on closed source
>platforms fuzzers are happily picking out stack overflows in initial
>handshake messages.

This phenomenon may be reflected in the types of vulnerabilities that
are reported in open vs. closed source advisories.  Below is a list of
"Top 10" vulnerabilities, and their relative ranking in open/closed
source advisories.  This is based on CVE statistics from January 2000
to October 2002.  These results were presented to the Open Source
Security Summit in Washington DC in October.

Interestingly, format string vulnerabilities and symlink issues are
much more frequently reported in open source vs. closed source
advisories.  Is this because closed source doesn't have the problem?
Probably not.  I suspect, rather, that symlinks and format strings are
more easily found using source code inspection than "black box"
testing.

As another example, many security advisories still report "buffer
overflows," but the terminology has not caught up with the emerging
variants of overflows, e.g. the chunked encoding problems that hit web
servers, integer signedness errors, etc.  This can make it difficult
to know whether a product has a "classic" overflow, or one of the
newer flavors that may take the vendor some time to completely stamp
out.  (For historical precedent, consider how format string
vulnerabilities were often called "overflows" in the early days.)
Vague advisories make the problem worse.

Many of the major vulnerabilities of this year aren't your classic
run-of-the-mill overflows, although they are labeled as such.

Consumers could use this top ten to evaluate how well their software
providers are performing... with the caveat that terminology is
imprecise.

Note: in general, I do not believe it is appropriate to compare the
security of products based on the number of released advisories.
There are too many variables including different risk tolerances,
vendors' willingness to publicly acknowledge bugs, the use by vendors
of other channels besides advisories, and the fact that a single
advisory can cover anywhere from 1 to 10 bugs.  In this case, we are
using reasonably normalized information (i.e. CVE names) to make the
comparison.

Thanks to Mark Cox of Red Hat Linux for suggestions that framed this
research in the context of open and closed source advisories.

- Steve

--------------------------------------------------
The Ten Most Commonly Reported Vulnerability Types
--------------------------------------------------
As of: October 2002

The overall rank and percentage are obtained from 3582 CVE entries or
CANs (CVE-xxxx or CAN-xxxx) from January 2000 to October 2002.  Note:
due to various factors, CVE is not necessarily complete for this time
period.

The open/closed source rank is obtained from 1184 CVE identifiers for
advisories from well-known open or closed source OS vendors,
approximately 600 each.

Overall                         Overall      Open Src.    Closed Src.
Rank      Flaw type             Percent      Rank         Rank
-------   --------------------  -------      ---------    -----------
1         Buffer overflow        21.8%          1             1
2         Directory Traversal     6.8%         11            14
3         "Malformed input"       5.8%          6             2
4         Shell Metacharacters    4.4%          5             7
5         Symlink Following       3.6%          2            10
6         Privilege Handling      3.5%          4             3
7         Cross-site scripting    3.1%          8            13
8         Cryptographic error     2.9%         13            11
9         Format strings          2.8%          3            12
10        Bad permissions         2.4%          7             5
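
One rough way to read the table is to compare each flaw type's two
ranks directly.  The sketch below copies the ranks from the table and
sorts by the difference; the function and variable names are
illustrative only.  A rank difference is not a frequency difference,
so treat the output as a reading aid rather than a measurement.

```python
# Ranks copied from the table above:
# (flaw type, open source rank, closed source rank).
RANKS = [
    ("Buffer overflow", 1, 1),
    ("Directory Traversal", 11, 14),
    ("Malformed input", 6, 2),
    ("Shell Metacharacters", 5, 7),
    ("Symlink Following", 2, 10),
    ("Privilege Handling", 4, 3),
    ("Cross-site scripting", 8, 13),
    ("Cryptographic error", 13, 11),
    ("Format strings", 3, 12),
    ("Bad permissions", 7, 5),
]


def rank_gaps(rows):
    """Return (flaw, closed_rank - open_rank) pairs, largest gap first.

    A positive gap means the flaw ranks higher (closer to #1) in open
    source advisories than in closed source advisories.
    """
    gaps = [(flaw, closed - open_) for flaw, open_, closed in rows]
    return sorted(gaps, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for flaw, gap in rank_gaps(RANKS):
        print(f"{flaw:22s} {gap:+d}")
```

Format strings and symlink following come out with the largest
positive gaps, and "malformed input" with the largest negative one.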

Notes on Flaw Types
-------------------

These types have evolved over the past couple of years.  I am unaware
of vulnerability classification efforts that go down to this level of
abstraction (Krsul's work is close), but I am working on something
that uses a lower level of abstraction than these categories.

These flaw types focus on the programmer's error, not the type of
attack that is used to exploit the issue.

"Buffer overflow" covers most overflow-flavored issues, due to lack of
precise terminology in vulnerability reports.  It includes new
"flavors" of programmer errors that are lumped in with overflows,
e.g. causing an overflow by changing a length variable to misrepresent
the length of a data buffer.
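
A minimal sketch of the length-variable flavor, assuming a
hypothetical message format with a 4-byte little-endian length field
followed by the payload.  Python will not overflow a buffer here; the
code only models the check that a C program might get wrong.  The
names MAX_PAYLOAD, declared_length_naive and read_payload_checked are
made up for illustration.

```python
import struct

MAX_PAYLOAD = 1024  # hypothetical size of the destination buffer


def declared_length_naive(message: bytes) -> int:
    """Model a flawed check: signed length, upper bound only."""
    (length,) = struct.unpack("<i", message[:4])  # "<i" is SIGNED 32-bit
    if length > MAX_PAYLOAD:
        raise ValueError("length too large")
    # A negative length passes the check above.  In C, handing it to a
    # copy routine that takes an unsigned size would turn it into a
    # very large number.
    return length


def read_payload_checked(message: bytes) -> bytes:
    """Read the length as unsigned and compare it to the real data."""
    if len(message) < 4:
        raise ValueError("short header")
    (length,) = struct.unpack("<I", message[:4])  # "<I" is UNSIGNED 32-bit
    if length > MAX_PAYLOAD or length > len(message) - 4:
        raise ValueError("declared length does not match the data")
    return message[4:4 + length]
```

With the length bytes set to ff ff ff ff, the naive version returns -1
without complaint, while the checked version rejects the message.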

"Directory traversal" covers all variants - "../" "..\" "%2e" "..."
etc. (but "/abs/path" is not included here).

"Malformed input" is a high-level type that covers illegally formatted
input.  This category is poorly understood and requires research (the
PROTOS project has made great strides in certain subclasses of
malformed input).  Advisories rarely provide the detail to precisely
understand how the input is malformed.  For example, consider all the
nmap/Spike scans that find *some* bug, but the bug is not diagnosed
fully enough to determine the exact type of input that caused the
problem.

"Privilege Handling" covers (a) when a process or function is assigned
higher privileges than it is supposed to have, or (b) when an attacker
can bypass authentication to access a privileged capability.

"Cross-site scripting" covers the injection of HTML or script into
either links or web pages (some people distinguish between the two).

"Cryptographic error" covers (a) insecure design [e.g. bad algorithm]
or (b) an incorrect implementation of a cryptographic algorithm.

"Bad permissions" covers when a program assigns insecure permissions
or access control to a file or directory, whether as a result of a
design choice or an implementation error.

Notes on Rankings
-----------------

Approximately 25% of the CVE data could not be classified because (a)
there was insufficient information to determine the type of
vulnerability, or (b) the vulnerability was unusual enough that it did
not map to an existing type.  Hopefully, item (b) will be addressed in
future work.

The rankings may reflect differences between open and closed source
vendors in terms of releasing advisories.  For example, some closed
source vendors may decide not to release advisories for locally
exploitable issues.

Buffer overflows continue to dominate the landscape.  This is probably
due to advances in exploitation techniques, the discovery of new
flavors, and the continued use of programming languages that are
subject to overflows.

Directory traversal vulnerabilities may be #2 because (a) there are so
many variants - some of them platform-specific - and (b) many
programming languages do not provide a clean way of "sandboxing" file
system access.  Their rank in open/closed source advisories might be
low because (a) directory traversal appears most often in web
applications, which are rarely "owned" by operating system vendors,
(b) web applications are easier to develop than entire servers, so
less skilled programmers may be making the bulk of the errors, (c)
major vendors are well aware of directory traversal issues, or
some combination of these.
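
To make point (b) concrete, here is a minimal sketch of the kind of
containment check a programmer has to write by hand, assuming a POSIX
file system.  The function name resolve_inside and the base directory
are hypothetical.  Any URL decoding (e.g. of "%2e") would have to
happen before this check, and platform-specific variants such as "..\"
need separate care.

```python
import os


def resolve_inside(base_dir: str, user_path: str) -> str:
    """Resolve user_path under base_dir, refusing anything that escapes."""
    base = os.path.realpath(base_dir)
    # realpath() collapses "../" sequences and resolves symlinks.  If
    # user_path is absolute, join() discards base, and the check below
    # rejects the result.
    candidate = os.path.realpath(os.path.join(base, user_path))
    if os.path.commonpath([base, candidate]) != base:
        raise PermissionError(f"path escapes {base_dir!r}: {user_path!r}")
    return candidate
```

For example, resolve_inside("/srv/www", "img/a.png") returns a path
under /srv/www, while "../etc/passwd" and "/etc/passwd" both raise
PermissionError.  There is still a window between this check and the
later open() in which the file system can change.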

Malformed input errors may be more frequently reported in closed
source because of black box testing.  This *might* suggest that the
auditing of open source products relies more on source code analysis
than dynamic testing, but it's unclear.  This entire category is not
well understood, and I encourage enterprising researchers to study
this more closely.

Notice the relatively high percentage of symlink issues in open source
advisories.  This could be because it is fairly easy to find symlink
problems in source code using existing scanners, but there may be an
implicit bias in the data because some closed source vendors do not
run Unix.  However, a small handful of link issues have been reported
for products that run on Windows, so this may be under-researched in
the Windows world.
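
For readers less familiar with this flaw type: a typical symlink bug
involves a program writing to a predictable path (often under /tmp)
where an attacker has already planted a link to some other file.  The
sketch below shows one common defense on Unix-like systems, creating
the file exclusively so that an existing link is never followed.  The
function name create_private_file is made up for illustration.

```python
import os


def create_private_file(path: str) -> int:
    """Create a brand-new file at path and return its file descriptor.

    O_CREAT together with O_EXCL makes the open fail if anything
    already exists at path, including a symbolic link (even a dangling
    one), so a planted link is not followed.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    return os.open(path, flags, 0o600)
```

If a link is already present at the path, the call raises
FileExistsError and the link target is left untouched.  In practice
the tempfile module's mkstemp() is usually a better choice than a
hand-rolled helper.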

Format string vulnerabilities may share this property with symlinks:
they are fairly easy to find during source code review.  They are
probably harder to find via black box testing than symlink errors,
because format string issues often reside in error logging code, which
can be very difficult to trigger.  In addition, there are tools
available for monitoring symlink creation, e.g. L0pht Watch.
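
To illustrate why format string problems tend to stand out in source
code, here is a deliberately naive sketch that flags C calls whose
format argument is not a string literal.  It is a toy heuristic, not
one of the existing scanners; the function list, the regular
expression and the name suspicious_format_calls are all assumptions
made for this example, and it will miss multi-line calls and
mis-parse arguments that contain commas.

```python
import re

# 0-based position of the format argument for a few C functions.
FORMAT_ARG_INDEX = {
    "printf": 0,
    "fprintf": 1,
    "sprintf": 1,
    "snprintf": 2,
    "syslog": 1,
}

CALL_RE = re.compile(
    r"\b(printf|fprintf|sprintf|snprintf|syslog)\s*\(([^;]*)\)\s*;"
)


def suspicious_format_calls(c_source: str):
    """Return (line number, function, format argument) for risky calls."""
    findings = []
    for lineno, line in enumerate(c_source.splitlines(), start=1):
        for match in CALL_RE.finditer(line):
            name, arg_text = match.group(1), match.group(2)
            # Naive split: good enough when earlier arguments have no commas.
            args = [arg.strip() for arg in arg_text.split(",")]
            index = FORMAT_ARG_INDEX[name]
            if index < len(args) and not args[index].startswith('"'):
                findings.append((lineno, name, args[index]))
    return findings
```

Given the lines `printf(name);` and `syslog(LOG_ERR, msg);`, both are
flagged, while `printf("%s", name);` is not.  Finding the same bug
from the outside requires actually reaching the logging call with
attacker-controlled input.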

Additional Information
----------------------

For additional CVE-related vulnerability statistics, see:

  http://cve.mitre.org/board/archives/2002-10/msg00005.html

This includes additional flaw types.

See the "secprog" mailing list (November 2002) for some extensive
discussions of the roles of programming languages in vulnerabilities:

  http://marc.theaimsgroup.com/?l=secprog&r=1&b=200211&w=2