On Mon, Nov 7, 2011 at 10:17 AM, Sasha Levin <levinsasha928@xxxxxxxxx> wrote:
> Hi Avi,
>
> Thank you for your comments!
>
> Just one question below:
>
> On Mon, Nov 7, 2011 at 11:26 AM, Avi Kivity <avi@xxxxxxxxxx> wrote:
>> Crashing the guest is fine (not 100% - you can have unprivileged code
>> managing a device, in which case we allow unprivileged code to crash the
>> entire guest - but that's rare). Running code on the host is also fine;
>
> On Mon, Nov 7, 2011 at 11:26 AM, Avi Kivity <avi@xxxxxxxxxx> wrote:
>> One thing to beware of is memory hotplug. If the memory map is static,
>> then a fork() once everything is set up (with MAP_SHARED) allows all
>> processes to access guest memory. However, if memory hotplug is
>> supported (or planned to be supported), then you can't do that, as
>> seccomp doesn't allow you to run mmap() in confined processes.
>>
>> This means they have to use RPC to the main process in order to access
>> memory, which is going to slow them down significantly.
>
> Is the risk of non-privileged guest code being able to exploit the
> hypervisor to access guest memory which it's not allowed to access
> really that small? I actually thought it would be one of the main
> concerns we'd need to handle, but from what I understand from you it's
> an irrelevant scenario.
>
> If it's really the case, then mapping guest memory is preferable.
> While mmap() is an issue, I think it's a great example of why seccomp
> filters are needed in the kernel, and might be a good chance to push
> that feature forward. In that sense, 'Secure KVM' could be used as a
> guinea pig both for seccomp filters and future QEMU work.

This is a really interesting topic - something that we've discussed in
QEMU as well.

Doing it with seccomp is really hard since that only allows read(2),
write(2), exit(2), and sigreturn(2). I think using seccomp means that
host devices (e.g. actual network and block device I/O) are implemented
outside the seccomp sandbox because they require other syscalls. The
seccomp process would then simply do hardware emulation, using IPC for
all actual I/O.

Where do the VNC server, the image formats, etc. go? It would be nice
to confine them too. In that respect I think Avi's ideas about using
safe programming languages (even if just a NaCl toolchain) are nice
because they are more general and apply to all of the codebase.

Stefan
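
The split described above, where the confined process only emulates
hardware and asks an outside process to do anything that needs other
syscalls, can be sketched roughly as follows. This is a minimal,
Linux-only illustration and not how QEMU or the KVM tool actually does
it: the names mem_req, mem_rsp, guest_mem, confined_device_model() and
run_brokered_read() are all hypothetical, and the "guest memory" is
just a small static buffer owned by the unconfined parent.

```c
#define _GNU_SOURCE
#include <linux/seccomp.h>
#include <stdint.h>
#include <sys/prctl.h>
#include <sys/socket.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical wire format: ask the broker for one byte of guest memory. */
struct mem_req { uint32_t offset; };
struct mem_rsp { uint8_t value; };

/*
 * Runs in the child. After entering strict seccomp only read(2),
 * write(2), exit(2) and sigreturn(2) are permitted, so every access to
 * "guest memory" has to go over the already-open channel.
 * Note: glibc's exit()/_exit() use exit_group(2), which strict mode
 * does not allow, hence the raw SYS_exit calls.
 */
static void confined_device_model(int chan)
{
    struct mem_req req = { .offset = 3 };
    struct mem_rsp rsp = { 0 };

    if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT) != 0)
        syscall(SYS_exit, 2);               /* could not enter the sandbox */

    if (write(chan, &req, sizeof req) != (ssize_t)sizeof req)
        syscall(SYS_exit, 3);
    if (read(chan, &rsp, sizeof rsp) != (ssize_t)sizeof rsp)
        syscall(SYS_exit, 4);

    /* Offset 3 of the demo buffer "abcdef" is expected to be 'd'. */
    syscall(SYS_exit, rsp.value == 'd' ? 0 : 1);
}

/*
 * Unconfined parent: owns the memory and serves one request.
 * Returns the child's exit status (0 on success) or -1 on failure,
 * including the case where the child was killed by seccomp.
 */
static int run_brokered_read(void)
{
    static const uint8_t guest_mem[] = "abcdef"; /* stand-in for guest RAM */
    struct mem_req req;
    struct mem_rsp rsp;
    int sv[2], status;
    pid_t pid;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        close(sv[0]);
        confined_device_model(sv[1]);       /* does not return */
    }
    close(sv[1]);

    /* Validate the request before touching memory on the child's behalf. */
    if (read(sv[0], &req, sizeof req) == (ssize_t)sizeof req &&
        req.offset < sizeof guest_mem - 1) {
        rsp.value = guest_mem[req.offset];
        if (write(sv[0], &rsp, sizeof rsp) != (ssize_t)sizeof rsp)
            status = -1;                    /* child will see a short read */
    }
    close(sv[0]);

    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Every access costs a round trip through the socket pair here, which is
the RPC overhead mentioned in the quoted text; a static MAP_SHARED
mapping set up before entering seccomp would avoid it, at the price of
not being able to mmap() newly hotplugged memory later.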