On Fri, 2014-08-01 at 09:27 +0100, Jan Beulich wrote:
> >>> On 31.07.14 at 15:16, <frediano.ziglio@xxxxxxxxxx> wrote:
> > Add a RESTRICT ioctl to /dev/xen/privcmd, which allows privileged commands
> > file descriptor to be restricted to only working with a particular domain.
>
> The "with" here has been quite confusing, and I realized that you
> mean the subject domain rather than the actor one only after
> having gone through quite some parts of the patch. For a patch
> this size, a little more of a description (and the original motivation)
> would have helped.
>

Yes, you are right.

> Wrt motivation: Why does this need enforcing in the kernel at all?
> Doesn't XSM_DM_PRIV mode deal specifically with what you're
> trying to do here? Or else I guess I really need some better
> explanation of what this is about.
>
> Jan
>

This is quite old for me, but you are right: it is perhaps not that
clear to other people.

In XenServer we have some patches that allow Qemu to run in dom0 but
work only on a specific domain. The patches required changes to libxc,
the kernel and Qemu. We are reimplementing these patches because the old
implementation had some problems (one is that the libxc patch was quite
big). The feature was removed because the kernel patches did not work
with newer (3.x) kernels.

Now, XSM_DM_PRIV works by checking whether the calling domain's target is
the domain we are going to handle. However, if your dom0 (as in
XenServer) runs the Qemu instances for all VMs, it cannot be bound to a
single target, so XSM is not usable. Xen has no knowledge of processes or
file descriptors (which are kernel-specific), so it has no way to tell
which domain a given caller should be restricted to.

The problem could be solved if the restriction were applied per system
call (so we could say "execute these hypercalls with these policies").
However, this would require making the target at least CPU-specific and
handling preemption correctly so that policies do not get mixed.
This could be quite heavy, so we hack the kernel to do the restriction
instead (it was also easier to port the patches).

The changes in Qemu to handle the privcmd/evtchn restrictions are
actually quite small: they mainly restrict these two handles with an
ioctl. Other parts of the patch (chroot, setuid, groups, resource limits,
and mostly xenstore accesses) are heavier.

Frediano
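
To make the idea of a per-file-descriptor restriction more concrete, here
is a minimal user-space model of the check. It is only a sketch, not the
code from the patch: the names (privcmd_handle, handle_restrict,
handle_may_target), the DOMID_UNRESTRICTED sentinel and the one-way
behaviour (a restricted handle cannot be moved to another domain) are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t domid_t;

/* Hypothetical sentinel: the handle has not been restricted yet. */
#define DOMID_UNRESTRICTED ((domid_t)0xFFFF)

/* Hypothetical per-open-file state; one instance per opened handle. */
struct privcmd_handle {
    domid_t restricted_to;
};

/* A freshly opened handle may act on any domain. */
static void handle_init(struct privcmd_handle *h)
{
    assert(h != NULL);
    h->restricted_to = DOMID_UNRESTRICTED;
}

/*
 * Model of the RESTRICT operation. Assumed to be one-way: once a handle
 * is bound to a domain it cannot be rebound to a different one.
 * Returns 0 on success, -1 if the handle is already bound elsewhere.
 */
static int handle_restrict(struct privcmd_handle *h, domid_t domid)
{
    if (h->restricted_to != DOMID_UNRESTRICTED && h->restricted_to != domid)
        return -1;
    h->restricted_to = domid;
    return 0;
}

/* Check done before forwarding an operation that targets a domain. */
static bool handle_may_target(const struct privcmd_handle *h, domid_t target)
{
    return h->restricted_to == DOMID_UNRESTRICTED ||
           h->restricted_to == target;
}
```

In this model each Qemu process would call handle_restrict() once on its
own handle, before dropping privileges, and every later operation on that
handle would be filtered through handle_may_target(). The state lives
with the handle rather than with the calling domain, which is why several
Qemu processes in the same dom0 can each be bound to a different guest.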