On Tue, Dec 07, 2021 at 09:41:32 +0000, Daniel P. Berrangé wrote:
> On Tue, Dec 07, 2021 at 10:19:42AM +0100, Jiri Denemark wrote:
> > Userfaultfd is by default allowed only for privileged processes. Since
> > libvirt runs QEMU unprivileged, we need to enable unprivileged access to
> > userfaultfd before starting post-copy migration.
> >
> > Rather than providing a static sysctl configuration file, we set the
> > sysctl knob in runtime once post-copy migration is requested. This way
> > unprivileged_userfaultfd is only enabled once actually used.
>
> I'm really not a fan of silently changing sysctl knobs on the
> fly like this, as it means the change is essentially invisible
> to the host admin.
>
> IIUC, the kernel change was made because of fear of risk this
> feature exposes to the kernel when combined with other flaws.
>
> Now I don't know how valid that fear is, but given that starting
> point, I think if we're going to change it, then the change ought
> to be visible to admins in a fairly obvious way.
>
> IOW, something ought to be dropping a file into /etc/sysctl.d/
> that enables it. The downside then is that it applies to all installs,
> even if they don't migrate. The flipside is that a default of 1 has
> been the historical value since postcopy first arrived, so all QEMU
> installs always had this behaviour.
>
> If we drop in a file 50-qemu-postcopy.conf, someone else can drop
> in a file 55-turn-it-off-again.conf to override our default.
>
> Still this all feels so awful every way I look at it :-(

Yes, neither option is nice. I chose to keep the default setting until
post-copy migration is requested because it doesn't change anything for
those who don't need it. But while we already set some sysctl knobs at
runtime (networkEnableIPForwarding), one can argue we do so based on an
explicit configuration in an XML rather than as a result of a flag
passed to an API.

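For illustration only, the drop-in approach could look roughly like the
sketch below. The file names are the ones from Daniel's example; the
comments and the exact layout are assumptions, and whatever v3 actually
installs may differ. Files in a sysctl.d directory are applied in
lexical order, so a higher-numbered file overrides a lower-numbered one.

```ini
# /etc/sysctl.d/50-qemu-postcopy.conf (name from the example above; hypothetical)
# Allow unprivileged processes such as QEMU to use userfaultfd,
# which post-copy migration needs.
vm.unprivileged_userfaultfd = 1

# /etc/sysctl.d/55-turn-it-off-again.conf (hypothetical admin override)
# Applied after the 50-* file, so this value takes effect.
vm.unprivileged_userfaultfd = 0
```
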
I agree the sysctl config file makes this setting more visible and
admins can turn off the knob if they want, which cannot be done when we
change the setting unconditionally at runtime.

That said, I'll prepare a v3 which will install a sysctl config file
instead.

Jirka