On 31.07.24 09:12, Xu Yilun wrote:
On Fri, Jul 26, 2024 at 09:08:51AM +0200, David Hildenbrand wrote:
On 26.07.24 07:02, Tian, Kevin wrote:
From: David Hildenbrand <david@xxxxxxxxxx>
Sent: Thursday, July 25, 2024 10:04 PM
Open
====
Implementing a RamDiscardManager to notify VFIO of page conversions
causes changes in semantics: private memory is treated as discarded (or
hot-removed) memory. This isn't aligned with the expectation of current
RamDiscardManager users (e.g. VFIO or live migration) who really
expect that discarded memory is hot-removed and thus can be skipped when
the users are processing guest memory. Treating private memory as
discarded won't work in the future if VFIO or live migration needs to handle
private memory. For example, VFIO may need to map private memory to support
Trusted IO, and live migration for confidential VMs needs to migrate
private memory.
"VFIO may need to map private memory to support Trusted IO"
I've been told that the way we handle shared memory won't be the way
this is going to work with guest_memfd. KVM will coordinate directly
with VFIO or $whatever and update the IOMMU tables itself right in the
kernel; the pages are pinned/owned by guest_memfd, so that will just
work. So I don't consider that currently a concern. guest_memfd private
memory is not mapped into user page tables and as it currently seems it
never will be.
Or could extend MAP_DMA to accept guest_memfd+offset in place of
With TIO, I can imagine several buffer sharing requirements: KVM maps VFIO
owned private MMIO, IOMMU maps gmem owned private memory, IOMMU maps VFIO
owned private MMIO. These buffers cannot be found via the user page tables
anymore. I'm wondering whether it would be messy to have specific PFN finding
methods for each FD type. Is it possible to have a unified way for
buffer sharing and PFN finding? Is dma-buf a candidate?
No expert on that, so I'm afraid I can't help.
'vaddr' and have VFIO/IOMMUFD call guest_memfd helpers to retrieve
the pinned pfn.
In theory yes, and I've been thinking of the same for a while, until people
told me that it is unlikely to work that way in the future.
Could you help specify why it won't work? As Kevin mentioned below, SEV-TIO
may still allow userspace to manage the IOMMU mapping for private memory. I'm
not sure how they map private memory for the IOMMU without touching guest_memfd.
I raised that question in [1]:
"How would the device be able to grab/access "private memory", if not
via the user page tables?"
Jason summarized it as "The approaches I'm aware of require the secure
world to own the IOMMU and generate the IOMMU page tables. So we will
not use a GUP approach with VFIO today as the kernel will not have any
reason to generate a page table in the first place. Instead we will say
"this PCI device translates through the secure world" and walk away."
I think for some cVM approaches it really cannot work without letting
KVM/secure world handle the IOMMU (e.g., sharing of page tables between
IOMMU and KVM).
For your use case it *might* work, but I am wondering if this is how it
should be done, and if there are better alternatives.
[1] https://lkml.org/lkml/2024/6/20/920
--
Cheers,
David / dhildenb