On 3/20/24 09:39, Michael Roth wrote:
> Some subsystems like VFIO might disable ram block discard for
> uncoordinated cases. Since kvm_convert_memory()/guest_memfd don't
> implement a RamDiscardManager handler to convey discard operations to
> various listeners like VFIO. Because of this, sequences like the
> following can result due to stale IOMMU mappings:
Alternatively, should guest-memfd memory regions call ram_block_discard_require(true)? This will prevent VFIO from operating, but it will avoid consuming twice the memory.
If desirable, guest-memfd support can be changed to implement an extension of RamDiscardManager that notifies about private/shared memory changes, and then guest-memfd would be able to support coordinated discard. But I wonder if that's doable at all - how common are shared<->private flips, and is it feasible to change the IOMMU page tables every time?
If the real solution is SEV-TIO (which means essentially guest_memfd support for VFIO), calling ram_block_discard_require(true) may be the simplest stopgap solution.
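
To make the trade-off concrete, here is a toy model of how "require" and "disable" exclude each other. It is a sketch only: the struct and the toy_* function names are hypothetical, and the -EBUSY behaviour reflects my understanding of the ram_block_discard_require()/ram_block_discard_disable() semantics rather than a copy of the QEMU code.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not QEMU's implementation. */
struct toy_discard_state {
    unsigned int disabled_cnt; /* users that need discards to never happen (e.g. VFIO) */
    unsigned int required_cnt; /* users that need discards to work (e.g. guest_memfd) */
};

/* Models "disable discards": refused while any user requires them. */
int toy_discard_disable(struct toy_discard_state *s, bool state)
{
    if (!state) {
        if (s->disabled_cnt == 0) {
            return -EINVAL; /* unbalanced release */
        }
        s->disabled_cnt--;
        return 0;
    }
    if (s->required_cnt > 0) {
        return -EBUSY;
    }
    s->disabled_cnt++;
    return 0;
}

/* Models "require discards": refused while any user has disabled them. */
int toy_discard_require(struct toy_discard_state *s, bool state)
{
    if (!state) {
        if (s->required_cnt == 0) {
            return -EINVAL; /* unbalanced release */
        }
        s->required_cnt--;
        return 0;
    }
    if (s->disabled_cnt > 0) {
        return -EBUSY;
    }
    s->required_cnt++;
    return 0;
}
```

In this model, once a guest-memfd region has called toy_discard_require(&s, true), a later toy_discard_disable(&s, true) from a VFIO-like user fails with -EBUSY, which is the "prevents VFIO from operating" side of the stopgap.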
Paolo
>   - convert page shared->private
>   - discard shared page
>   - convert page private->shared
>   - new page is allocated
>   - issue DMA operations against that shared page
>
> Address this by taking ram_block_discard_is_enabled() into account when
> deciding whether or not to discard pages.
>
> Signed-off-by: Michael Roth <michael.roth@xxxxxxx>
> ---
>  accel/kvm/kvm-all.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index 53ce4f091e..6ae03c880f 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -2962,10 +2962,14 @@ static int kvm_convert_memory(hwaddr start, hwaddr size, bool to_private)
>                   */
>                  return 0;
>              } else {
> -                ret = ram_block_discard_range(rb, offset, size);
> +                ret = ram_block_discard_is_disabled()
> +                      ? ram_block_discard_range(rb, offset, size)
> +                      : 0;
>              }
>          } else {
> -            ret = ram_block_discard_guest_memfd_range(rb, offset, size);
> +            ret = ram_block_discard_is_disabled()
> +                  ? ram_block_discard_guest_memfd_range(rb, offset, size)
> +                  : 0;
>          }
>      } else {
>          error_report("Convert non guest_memfd backed memory region "
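
The stale-mapping sequence described in the commit message can be walked through with a small toy model. This is a sketch under stated assumptions: host pages are represented by integer ids, the IOMMU mapping is a single pinned id, and all toy_* names are hypothetical. It follows the intent stated in the commit message (skip the discard when discards are disabled) and does not reproduce the QEMU code.

```c
#include <stdbool.h>

/* Toy model only: one shared guest page and one IOMMU mapping for it. */
struct toy_page {
    int backing;      /* id of the host page backing the shared GPA, 0 = none */
    int next_backing; /* next host page id to hand out */
    int iommu_target; /* host page id the IOMMU mapping was pinned to */
};

/* Start with the page allocated and mapped for DMA. */
void toy_init(struct toy_page *p)
{
    p->next_backing = 1;
    p->backing = p->next_backing++;
    p->iommu_target = p->backing;
}

/* shared->private: drop the shared backing unless discards are disabled. */
void toy_convert_to_private(struct toy_page *p, bool discard_disabled)
{
    if (!discard_disabled) {
        p->backing = 0; /* shared page discarded; the IOMMU is not told */
    }
}

/* private->shared: a fresh host page is allocated if the old one is gone. */
void toy_convert_to_shared(struct toy_page *p)
{
    if (p->backing == 0) {
        p->backing = p->next_backing++;
    }
}

/* DMA is stale when the IOMMU still points at a page the guest no longer uses. */
bool toy_dma_is_stale(const struct toy_page *p)
{
    return p->iommu_target != p->backing;
}
```

With discard_disabled set to false, a private/shared round trip leaves toy_dma_is_stale() returning true. With it set to true, the original backing is kept and the mapping stays valid, at the cost of the shared page remaining allocated while the range is private.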