On Tue, Nov 02, 2021 at 09:33:55AM +0100, David Hildenbrand wrote:
> On 01.11.21 23:15, Michael S. Tsirkin wrote:
> > On Wed, Oct 27, 2021 at 02:45:19PM +0200, David Hildenbrand wrote:
> >> This is the follow-up of [1], dropping auto-detection and vhost-user
> >> changes from the initial RFC.
> >>
> >> Based-on: 20211011175346.15499-1-david@xxxxxxxxxx
> >>
> >> A virtio-mem device is represented by a single large RAM memory region
> >> backed by a single large mmap.
> >>
> >> Right now, we map that complete memory region into guest physical address
> >> space, resulting in a very large memory mapping, KVM memory slot, ...
> >> although only a small amount of memory might actually be exposed to the VM.
> >>
> >> For example, when starting a VM with a 1 TiB virtio-mem device that only
> >> exposes little device memory (e.g., 1 GiB) towards the VM initially,
> >> in order to hotplug more memory later, we waste a lot of memory on metadata
> >> for KVM memory slots (> 2 GiB!) and accompanying bitmaps. Although some
> >> optimizations in KVM are being worked on to reduce this metadata overhead
> >> on x86-64 in some cases, it remains a problem with nested VMs and there are
> >> other reasons why we would want to reduce the total memory slot to a
> >> reasonable minimum.
> >>
> >> We want to:
> >> a) Reduce the metadata overhead, including bitmap sizes inside KVM but also
> >>    inside QEMU KVM code where possible.
> >> b) Not always expose all device-memory to the VM, to reduce the attack
> >>    surface of malicious VMs without using userfaultfd.
> >
> > I'm confused by the mention of these security considerations,
> > and I expect users will be just as confused.
>
> Malicious VMs wanting to consume more memory than desired is only
> relevant when running untrusted VMs in some environments, and it can be
> caught differently, for example, by carefully monitoring and limiting
> the maximum memory consumption of a VM.
> We have the same issue already
> when using virtio-balloon to logically unplug memory. For me, it's a
> secondary concern (optimizing a is much more important).
>
> Some users showed interest in having QEMU disallow access to unplugged
> memory, because coming up with a maximum memory consumption for a VM is
> hard. This is one step into that direction without having to run with
> uffd enabled all of the time.

Sorry about missing the memo - is there a lot of overhead associated
with uffd then?

> ("security" is somewhat the wrong word. we won't be able to steal any
> information from the hypervisor.)

Right. Let's just spell it out.
Further, removing memory still requires guest cooperation.

>
> > So let's say user wants to not be exposed. What value for
> > the option should be used? What if a lower option is used?
> > Is there still some security advantage?
>
> My recommendation will be to use 1 memslot per gigabyte as default if
> possible in the configuration. If we have a virtio-mem device with a
> maximum size of 128 GiB, the suggestion will be to use memslots=128.
> Some setups will require less (e.g., vhost-user until adjusted, old
> KVM), some setups can allow for more. I assume that most users will
> later set "memslots=0", to enable auto-detection mode.
>
>
> Assume we have a virtio-mem device with a maximum size of 1 TiB and we
> hotplugged 1 GiB to the VM. With "memslots=1", the malicious VM could
> actually access the whole 1 TiB. With "memslots=1024", the malicious VM
> could only access additional ~ 1 GiB. With "memslots=512", ~ 2 GiB.
> That's the reduced attack surface.
>
> Of course, it's different after we hotunplugged memory, before we have
> VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE support in QEMU, because all memory
> inside the usable region has to be accessible and we cannot "unplug" the
> memslots.
>
>
> Note: With upcoming VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE changes in QEMU,
> one will be able to disallow any access for malicious VMs by setting the
> memslot size just as big as the device block size.
>
> So with a 128 GiB virtio-mem device with memslots=128,block-size=1G, or
> with memslots=1024,block-size=128M we could make it impossible for a
> malicious VM to consume more memory than intended. But we lose
> flexibility due to the block size and the limited number of available
> memslots.
>
> But again, for "full protection against malicious VMs" I consider
> userfaultfd protection more flexible. This approach here gives some
> advantage, especially when having large virtio-mem devices that start
> out small.
>
> --
> Thanks,
>
> David / dhildenb
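
The memslot figures quoted above can be checked with a minimal sketch. This
is not QEMU code: it assumes the device region is split into equally sized
memslots, and the helper names (memslot_size, recommended_memslots,
memslot_matches_block) are made up for illustration. Under that assumption
the per-memslot size is the granularity at which device memory becomes
accessible to the VM, which is where the ~1 GiB and ~2 GiB numbers come from.

```python
# Illustrative arithmetic only; assumes equally sized memslots.
MiB = 1 << 20
GiB = 1 << 30
TiB = 1 << 40


def memslot_size(device_max_size: int, memslots: int) -> int:
    """Size in bytes of one memslot when the region is split evenly."""
    if memslots < 1:
        raise ValueError("this sketch needs an explicit memslot count >= 1")
    if device_max_size % memslots:
        raise ValueError("device size must divide evenly across memslots")
    return device_max_size // memslots


def recommended_memslots(device_max_size: int) -> int:
    """One memslot per GiB of maximum device size, as suggested in the thread."""
    return max(1, device_max_size // GiB)


def memslot_matches_block(device_max_size: int, memslots: int,
                          block_size: int) -> bool:
    """True if each memslot is exactly as big as the device block size."""
    return memslot_size(device_max_size, memslots) == block_size


if __name__ == "__main__":
    # 1 TiB device: granularity of exposure for different memslot counts.
    for n in (1, 512, 1024):
        print(f"memslots={n}: {memslot_size(TiB, n) / GiB:g} GiB per memslot")

    # 128 GiB device: suggested default and the two block-size setups.
    print("recommended memslots:", recommended_memslots(128 * GiB))
    print(memslot_matches_block(128 * GiB, 128, 1 * GiB))
    print(memslot_matches_block(128 * GiB, 1024, 128 * MiB))
```

Both 128 GiB configurations mentioned above (memslots=128,block-size=1G and
memslots=1024,block-size=128M) give a memslot exactly as large as the block
size in this model, which is the condition described for disallowing access
to unplugged memory entirely.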