On Mon, 2021-03-08 at 15:57 +0000, Daniel P. Berrangé wrote:
> On Mon, Mar 08, 2021 at 04:32:26PM +0100, Andrea Bolognani wrote:
> > On Mon, 2021-03-08 at 13:17 +0000, Daniel P. Berrangé wrote:
> > > Since you added code to parse existing limits from /proc, I'm wondering
> > > if we can just do without the config option. Simply try to use prlimit
> > > and if it fails, query existing limits to determine if we should treat
> > > the prlimit as fatal or ignore it. Overall I'd prefer libvirt to
> > > "just work" out of the box rather than requiring people to know about
> > > setting a "make-vfio-hotplug-work=yes" flag in the config file.
> >
> > The problem with that approach is what to do when *lowering* the
> > limit, for example as a consequence of hot-unplugging the last VFIO
> > device from the VM.
> >
> > If we're controlling the memory locking limit ourselves, then failure
> > to lower it should be an error, because leaving the limit much higher
> > than necessary creates potential for DoS by a compromised QEMU; on
> > the other hand, if the limit is controlled by an external process,
> > all we can really do is assume they will do the right thing after
> > hot-unplugging has happened.
>
> IMHO once QEMU vCPUs start running, immediately assume QEMU is
> compromised / hostile. IOW, the DoS risk arrived the moment it
> was given the higher limit. We're just failing to close off the
> existing risk we've already accepted, which doesn't worry me much.
>
> On unplug the only thing we actually do when memory lock reduce
> fails is to log a warning message, it is never treated as a
> fatal error.
>
> So the only difference is whether we skip the warning message
> when we get EPERM from prlimit(), or always emit the warning.

You're right, we're currently just soft-failing when we can't lower
the memlock limit on unplug.
Given this and your assessment of the security implications, which I
trust, we should indeed be able to avoid introducing the qemu.conf
knob and just behave sanely in all scenarios out of the box. I'll
give it a try.

-- 
Andrea Bolognani / Red Hat / Virtualization