Re: What to move to?

On 04/15/2013 09:04 PM, Björn Persson wrote:
Miloslav Trmač wrote:
The logical conclusion from this is to move to a language with automatic
memory management.  The "top vulnerability" reports for programs written in
C/C++ and for most other languages are so different that starting a new
project that processes untrusted data in C/C++ is becoming indefensible.

If by "automatic memory management" you mean garbage collection, then
that's not really what we need. Garbage collection has advantages, but
what is needed to stop the buffer overflows is bounds checking. The
compiler needs to keep track of how big each object is and insert code
to check that writes to an array stay within the bounds of the array.
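
As a rough illustration of the check described above, here is a sketch in C of a write that is refused when it would fall outside the array. The names checked_buf and checked_store are hypothetical and only serve the example; this is not how any particular compiler implements its checks, just the shape of the test such a compiler would insert.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "fat pointer": the buffer travels together with its size,
   which is the information a bounds-checking compiler has to track. */
struct checked_buf {
    unsigned char *data;
    size_t len;
};

/* Store value at index. Returns false, and writes nothing, when the index
   is outside the buffer. */
static bool checked_store(struct checked_buf *buf, size_t index,
                          unsigned char value)
{
    if (index >= buf->len)
        return false;           /* the inserted bounds check */
    buf->data[index] = value;
    return true;
}
```

A caller has to react to the false result; in Ada the equivalent failure raises Constraint_Error instead.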

There's also the issue of dangling pointers (pointers which point to a memory location which now holds an object of a different type). They can result from misapplied memory management, or from type safety loopholes in the language definition. An example for Ada is here:

  <http://www.enyo.de/fw/notes/ada-type-safety.html>

(See the postscript—this was already known in the Ada 83 days. I still find it remarkable. It's possible to work around this in a GC-based implementation.)

Now, what to move to?  I currently don't see any language/runtime I
could recommend, which is in itself rather frightening.

I recommend Ada. Ada does bounds checking, and is compiled to machine
code with performance comparable to C.

Yes, Ada has some nice features. At least there are real arrays, but they are somewhat cumbersome to work with, compared to Java, Python or, well, C pointers. There are two aspects: preservation of array bounds in slices (so that you have to write Table (Table'First + Offset) to access the element Offset of Table, Offset ranging from 0 to Table'Length - 1), and the fact that it is impossible to put an unconstrained array (of arbitrary length) into a constrained object (i.e., you need an indirection).
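
To make the first aspect concrete, here is a sketch in C of a slice that keeps the index range of its parent array, roughly the way an Ada slice keeps 'First and 'Last. The type int_slice and the functions slice_length, slice_get and slice_get_offset are hypothetical names made up for this example; it models the indexing rule only, not Ada's actual representation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical slice that remembers the parent's index range. */
struct int_slice {
    const int *base;   /* points at the element whose index is `first` */
    long first;        /* lowest valid index, like Table'First */
    long last;         /* highest valid index, inclusive; last < first is empty */
};

static long slice_length(struct int_slice s)
{
    return s.last < s.first ? 0 : s.last - s.first + 1;
}

/* Access by index in the parent's numbering, with a bounds check. */
static bool slice_get(struct int_slice s, long index, int *out)
{
    if (index < s.first || index > s.last)
        return false;
    *out = s.base[index - s.first];
    return true;
}

/* Zero-based access has to add `first` explicitly, which is the
   Table (Table'First + Offset) idiom. */
static bool slice_get_offset(struct int_slice s, long offset, int *out)
{
    if (offset < 0 || offset >= slice_length(s))
        return false;
    return slice_get(s, s.first + offset, out);
}
```

For a slice covering indices 5 to 8, index 0 is out of range while offset 0 refers to the element at index 5.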

For many programming tasks, arrays might be at the wrong level of abstraction, but we have a lot of plumbing code which uses them heavily.

Garbage collection support would make it easier to introduce the indirection, but it would require a conservative collector at present, and those we have right now (Boehm-Demers-Weiser and the Go collectors) require a process-global view, touch signal handlers etc., so they do away with one significant Ada advantage (see below).

Only compiler bugs can cause
buffer overflows in Ada, unless you're so foolhardy that you disable the
bounds checking.

The GNAT run-time is compiled without language-defined checks, and it used to have at least one buffer overflow in the Ada part. Many Ada libraries used to follow GNAT's example and disabled the checks as well, but this has changed during the last few years, it appears. Manual overflow checks are hampered by the fact that -gnato still isn't the default.
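
For comparison, this is roughly what a manual overflow check looks like when the language does not trap integer overflow for you. It is a sketch in C, not Ada, and checked_add_int is a hypothetical helper name. It tests the operands before adding, because signed overflow is undefined behaviour in C and cannot be detected reliably after the fact.

```c
#include <limits.h>
#include <stdbool.h>

/* Add a and b. Returns false, and leaves *sum untouched, if the
   mathematical result does not fit in an int. */
static bool checked_add_int(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b))
        return false;           /* would overflow: refuse to compute */
    *sum = a + b;
    return true;
}
```

Every arithmetic step that handles untrusted sizes or lengths needs such a guard, which is why a compiler-inserted check that is on by default is attractive.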

Ada doesn't do garbage collection across the whole program, but features
such as controlled types, generic data structures and out parameters
greatly reduce the need for garbage collection. The double-free problem
is also eliminated. (Garbage collection was made optional in Ada so
that the language would be suitable for embedded real-time systems, and
in practice most compilers don't provide it.)

Controlled types have a fixed overhead which is quite visible with small objects. By default, code for abort deferral is emitted, the vtable pointer takes space, and avoiding unnecessary indirect calls takes some care by the programmer. There's also no well-defined ABI for shared libraries (and adding a subprogram can change the name of existing subprograms).

On the other hand, lack of garbage collection means that it's feasible to have some GNAT-compiled part in a larger program, without the larger program noticing that there's a component not written in C. I sometimes call this "deep embedding support", and only very few language implementations have this property at present. (Even with GNAT, you have to restrict yourself to a language subset.) The list of feasible systems programming languages is much, much longer, but most need global run-time state, threads, signal handler manipulation, have address space layout requirements etc. But that is primarily an implementation issue, not an aspect which is inherent to most languages.

The other aspect is low baseline overhead from the run-time system. We don't want programmers to rewrite working system components in C only to reduce memory usage. This is what happened (or is expected to happen) to some daemons written in Python.

--
Florian Weimer / Red Hat Product Security Team
--
devel mailing list
devel@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/devel




