On 11/07/2011 05:27 AM, Stefan Hajnoczi wrote:
On Mon, Nov 7, 2011 at 10:17 AM, Sasha Levin <levinsasha928@xxxxxxxxx> wrote:
This is a really interesting topic - something that we've discussed in QEMU as well. Doing it with seccomp is really hard since that only allows read(2), write(2), exit(2), and sigreturn(2).
I think using seccomp means that host devices (e.g. actual network and block device I/O) are implemented outside the seccomp sandbox because they require other syscalls. Then the seccomp process would simply do hardware emulation with IPCs for all actual I/O.
Where do the VNC server, the image formats, etc. go? It would be nice to confine them too. In that respect I think Avi's ideas about using safe programming languages (even if just a NaCl toolchain) are nice because they are more general and apply to all of the codebase.
It's a nice idea, but the NaCl toolchain doesn't have a nice upstream story right now.
I think seccomp() mode 1 isn't so bad. It's difficult to bootstrap, but once you have a reasonable set of RPCs, it shouldn't be all that bad of an environment to program in.
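To make the "set of RPCs" idea concrete, here is a minimal sketch of a mode 1 (strict) sandbox on Linux: a child process enters strict mode and then serves requests over two pipes that were opened beforehand. The wire format (struct rpc_req, struct rpc_rep) and the names xfer(), sandbox_child() and sandbox_add() are hypothetical and exist only for this illustration; nothing here is taken from QEMU or the kvm tool.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/seccomp.h>   /* SECCOMP_MODE_STRICT */
#include <stdint.h>
#include <sys/prctl.h>       /* prctl(), PR_SET_SECCOMP */
#include <sys/syscall.h>     /* SYS_exit */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical wire format: one fixed-size request, one fixed-size reply. */
struct rpc_req { int32_t a, b; };
struct rpc_rep { int32_t sum; };

/* Read or write exactly len bytes; returns 0 on success, -1 otherwise. */
static int xfer(int fd, void *buf, size_t len, int writing)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = writing ? write(fd, p, len) : read(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Child side. After prctl() succeeds, only read(2), write(2), exit(2)
 * and sigreturn(2) are permitted; anything else kills the process. */
static void sandbox_child(int req_fd, int rep_fd)
{
    struct rpc_req req;
    struct rpc_rep rep;

    if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT) != 0)
        _exit(2);   /* strict mode unavailable, e.g. a filter is already active */

    while (xfer(req_fd, &req, sizeof req, 0) == 0) {
        rep.sum = req.a + req.b;   /* the "service"; overflow is not handled */
        if (xfer(rep_fd, &rep, sizeof rep, 1) != 0)
            break;
    }
    /* glibc's _exit() issues exit_group(2), which strict mode does not
     * allow, so call the raw exit(2) syscall instead. */
    syscall(SYS_exit, 0);
}

/* Parent side. Returns 0 and stores the reply in *out, or -1 on failure
 * (including the case where the child could not enter strict mode). */
int sandbox_add(int32_t a, int32_t b, int32_t *out)
{
    int req_pipe[2], rep_pipe[2], status, ok;
    struct rpc_req req = { a, b };
    struct rpc_rep rep;
    pid_t pid;

    assert(out != NULL);
    if (pipe(req_pipe) != 0)
        return -1;
    if (pipe(rep_pipe) != 0) {
        close(req_pipe[0]); close(req_pipe[1]);
        return -1;
    }
    pid = fork();
    if (pid < 0) {
        close(req_pipe[0]); close(req_pipe[1]);
        close(rep_pipe[0]); close(rep_pipe[1]);
        return -1;
    }
    if (pid == 0) {
        close(req_pipe[1]);
        close(rep_pipe[0]);
        sandbox_child(req_pipe[0], rep_pipe[1]);   /* never returns */
    }
    close(rep_pipe[1]);
    /* Keep our own read end open across the write so that an early child
     * exit cannot raise SIGPIPE in the parent. */
    ok = xfer(req_pipe[1], &req, sizeof req, 1) == 0;
    close(req_pipe[0]);
    close(req_pipe[1]);   /* EOF on the request pipe ends the child's loop */
    ok = ok && xfer(rep_pipe[0], &rep, sizeof rep, 0) == 0;
    close(rep_pipe[0]);
    waitpid(pid, &status, 0);
    if (!ok)
        return -1;
    *out = rep.sum;
    return 0;
}
```

A caller would use it as sandbox_add(40, 2, &result) and check the return value first. The bootstrap difficulty shows up even at this size: every descriptor has to exist before prctl() is called, and even process exit needs the raw syscall rather than the libc wrapper.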
One way to think of a seccomp() sandbox is that it emulates the legacy device model and translates everything into an ultra-modern, no backwards compat, pure-virtio device model. From a QEMU perspective, it would treat the sandbox as part of the guest, and then implement a bare-bones machine that only exposes a couple of virtio interfaces to the sandbox. QEMU would then bridge this to the various types of backends.
Regards, Anthony Liguori
Stefan
--
To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html