On Thu, Jan 28, 2021, Paolo Bonzini wrote:
> On 28/01/21 11:48, Maciej S. Szmigiero wrote:
> > >
> > > VMMs (especially big ones like QEMU) are complex and e.g. each driver
> > > can cause memory regions (-> memslots in KVM) to change. With this
> > > feature it becomes possible to set a limit upfront (based on VM
> > > configuration) so it'll be more obvious when it's hit.
> > >
> >
> > I see: it's a kind of a "big switch", so every VMM doesn't have to be
> > modified or audited.
> > Thanks for the explanation.
>
> Not really, it's the opposite: the VMM needs to opt into a smaller number of
> memslots.

Yep, my thinking is that it would be similar to using seccomp to prevent doing
something that should never happen.

> I don't know... I understand it would be defense in depth, however between
> dynamic allocation of memslots arrays and GFP_KERNEL_ACCOUNT, it seems to be
> a bit of a solution in search of a problem.

I'm a-ok waiting to add a capability until there's a VMM that actually wants to
use it.

> For now I applied patches 1-2-5.

Why keep patch 1?  Simply raising the limit in patch 2 shouldn't require per-VM
tracking.

The 'memslots_max' name is also ambiguous.  In my head, the new capability would
restrict the _number_ of memslots, but as implemented in patches 1+3 it
restricts the max _ID_ of a memslot.

Limiting the max ID also effectively limits the max number of memslots, but that
approach confuses things since the IDs themselves do not affect memory
consumption.  Limiting the IDs bleeds the old implementation details into the
ABI.
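
To make the ID-versus-count distinction concrete, here is a rough toy sketch.
It is not KVM code and not what the patches do line for line: every name in it
(toy_vm, TOY_ID_SPACE, the toy_*() helpers) is made up for illustration, and it
assumes the simplest possible model where a slot is either populated or not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in KVM. */
#define TOY_ID_SPACE 512	/* hypothetical size of the slot ID space */

struct toy_vm {
	bool used[TOY_ID_SPACE];	/* used[id] is true if slot 'id' is populated */
	size_t nr_used;			/* number of populated slots */
	size_t limit;			/* the per-VM limit being discussed */
};

/* Policy A: 'limit' caps the slot ID, i.e. only IDs 0..limit-1 are valid. */
static bool toy_id_limit_allows(const struct toy_vm *vm, size_t id)
{
	assert(id < TOY_ID_SPACE);
	return id < vm->limit;
}

/* Policy B: 'limit' caps how many slots are populated; any ID is fine. */
static bool toy_count_limit_allows(const struct toy_vm *vm, size_t id)
{
	assert(id < TOY_ID_SPACE);
	/* Re-using an already populated ID does not add a slot. */
	return vm->used[id] || vm->nr_used < vm->limit;
}

/* Populate slot 'id' without applying any policy. */
static void toy_populate(struct toy_vm *vm, size_t id)
{
	assert(id < TOY_ID_SPACE);
	if (!vm->used[id]) {
		vm->used[id] = true;
		vm->nr_used++;
	}
}
```

With limit = 4, a VMM that populates only slots 0 and 100 is rejected under
policy A even though it holds just two slots, while policy B accepts it and
only starts refusing once a fifth distinct slot is requested.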