On 1/9/2025 12:29 PM, Chenyi Qiang wrote:
>
>
> On 1/9/2025 10:55 AM, Alexey Kardashevskiy wrote:
>>
>>
>> On 9/1/25 13:11, Chenyi Qiang wrote:
>>>
>>>
>>> On 1/8/2025 7:20 PM, Alexey Kardashevskiy wrote:
>>>>
>>>>
>>>> On 8/1/25 21:56, Chenyi Qiang wrote:
>>>>>
>>>>>
>>>>> On 1/8/2025 12:48 PM, Alexey Kardashevskiy wrote:
>>>>>> On 13/12/24 18:08, Chenyi Qiang wrote:
>>>>>>> As the commit 852f0048f3 ("RAMBlock: make guest_memfd require
>>>>>>> uncoordinated discard") highlighted, some subsystems like VFIO might
>>>>>>> disable ram block discard. However, guest_memfd relies on the discard
>>>>>>> operation to perform page conversion between private and shared memory.
>>>>>>> This can lead to stale IOMMU mapping issue when assigning a hardware
>>>>>>> device to a confidential VM via shared memory (unprotected memory
>>>>>>> pages). Blocking shared page discard can solve this problem, but it
>>>>>>> could cause guests to consume twice the memory with VFIO, which is not
>>>>>>> acceptable in some cases. An alternative solution is to convey other
>>>>>>> systems like VFIO to refresh its outdated IOMMU mappings.
>>>>>>>
>>>>>>> RamDiscardManager is an existing concept (used by virtio-mem) to adjust
>>>>>>> VFIO mappings in relation to VM page assignment. Effectively page
>>>>>>> conversion is similar to hot-removing a page in one mode and adding it
>>>>>>> back in the other, so the similar work that needs to happen in response
>>>>>>> to virtio-mem changes needs to happen for page conversion events.
>>>>>>> Introduce the RamDiscardManager to guest_memfd to achieve it.
>>>>>>>
>>>>>>> However, guest_memfd is not an object so it cannot directly implement
>>>>>>> the RamDiscardManager interface.
>>>>>>>
>>>>>>> One solution is to implement the interface in HostMemoryBackend. Any
>>>>>>
>>>>>> This sounds about right.
>>>>>>
>>>>>>> guest_memfd-backed host memory backend can register itself in the target
>>>>>>> MemoryRegion. However, this solution doesn't cover the scenario where a
>>>>>>> guest_memfd MemoryRegion doesn't belong to the HostMemoryBackend, e.g.
>>>>>>> the virtual BIOS MemoryRegion.
>>>>>>
>>>>>> What is this virtual BIOS MemoryRegion exactly? What does it look like
>>>>>> in "info mtree -f"? Do we really want this memory to be DMAable?
>>>>>
>>>>> virtual BIOS shows in a separate region:
>>>>>
>>>>>   Root memory region: system
>>>>>    0000000000000000-000000007fffffff (prio 0, ram): pc.ram KVM
>>>>>    ...
>>>>>    00000000ffc00000-00000000ffffffff (prio 0, ram): pc.bios KVM
>>>>
>>>> Looks like a normal MR which can be backed by guest_memfd.
>>>
>>> Yes, virtual BIOS memory region is initialized by
>>> memory_region_init_ram_guest_memfd() which will be backed by a
>>> guest_memfd.
>>>
>>> The tricky thing is, for Intel TDX (not sure about AMD SEV), the virtual
>>> BIOS image will be loaded and then copied to private region. After that,
>>> the loaded image will be discarded and this region become useless.
>>
>> I'd think it is loaded as "struct Rom" and then copied to the MR-
>> ram_guest_memfd() which does not leave MR useless - we still see
>> "pc.bios" in the list so it is not discarded. What piece of code are you
>> referring to exactly?
>
> Sorry for confusion, maybe it is different between TDX and SEV-SNP for
> the vBIOS handling.
>
> In x86_bios_rom_init(), it initializes a guest_memfd-backed MR and loads
> the vBIOS image to the shared part of the guest_memfd MR. For TDX, it
> will copy the image to private region (not the vBIOS guest_memfd MR
> private part) and discard the shared part. So, although the memory
> region still exists, it seems useless.

Let me correct myself. After some internal discussion, I found I had
misunderstood the vBIOS handling in TDX. The memory region is valid: the
vBIOS image is copied to the private region (the private part of the vBIOS
guest_memfd). Sorry for the confusion.

>
> It is different for SEV-SNP, correct? Does SEV-SNP manage the vBIOS in
> vBIOS guest_memfd private memory?
>
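To make the RamDiscardManager idea above more concrete, below is a minimal,
hypothetical sketch (not the actual patch) of a standalone QOM object
implementing the interface for a guest_memfd-backed MemoryRegion, loosely
modelled on the virtio-mem implementation. The type name
"guest-memfd-manager", the block_size/shared_bitmap fields and the choice to
treat "shared" as "populated" are illustrative assumptions; only two of the
interface callbacks are filled in, the rest would follow the same pattern.

/* Hypothetical sketch for illustration only -- names and layout are
 * assumptions, not necessarily what the series under review does. */
#include "qemu/osdep.h"
#include "qemu/bitops.h"
#include "qemu/queue.h"
#include "qemu/module.h"
#include "exec/memory.h"
#include "qom/object.h"

#define TYPE_GUEST_MEMFD_MANAGER "guest-memfd-manager"   /* made-up name */
OBJECT_DECLARE_SIMPLE_TYPE(GuestMemfdManager, GUEST_MEMFD_MANAGER)

struct GuestMemfdManager {
    Object parent;
    MemoryRegion *mr;             /* guest_memfd-backed MR being managed   */
    uint64_t block_size;          /* private/shared conversion granularity */
    unsigned long *shared_bitmap; /* 1 = shared, i.e. "populated" for VFIO */
    QLIST_HEAD(, RamDiscardListener) rdl_list;
};

static uint64_t guest_memfd_rdm_get_min_granularity(const RamDiscardManager *rdm,
                                                    const MemoryRegion *mr)
{
    return GUEST_MEMFD_MANAGER(rdm)->block_size;
}

/* A section counts as populated iff every block it covers is shared. */
static bool guest_memfd_rdm_is_populated(const RamDiscardManager *rdm,
                                         const MemoryRegionSection *section)
{
    const GuestMemfdManager *gmm = GUEST_MEMFD_MANAGER(rdm);
    uint64_t first = section->offset_within_region / gmm->block_size;
    uint64_t last = (section->offset_within_region +
                     int128_get64(section->size) - 1) / gmm->block_size;

    /* No zero (private) bit in [first, last] means fully populated. */
    return find_next_zero_bit(gmm->shared_bitmap, last + 1, first) > last;
}

static void guest_memfd_manager_class_init(ObjectClass *oc, void *data)
{
    RamDiscardManagerClass *rdmc = RAM_DISCARD_MANAGER_CLASS(oc);

    rdmc->get_min_granularity = guest_memfd_rdm_get_min_granularity;
    rdmc->is_populated = guest_memfd_rdm_is_populated;
    /* replay_populated/replay_discarded/register_listener/
     * unregister_listener would be wired up the same way; on a page
     * conversion the manager walks rdl_list and calls notify_populate()
     * or notify_discard() so that listeners like VFIO remap or unmap
     * the affected pages. */
}

static const TypeInfo guest_memfd_manager_info = {
    .name = TYPE_GUEST_MEMFD_MANAGER,
    .parent = TYPE_OBJECT,
    .instance_size = sizeof(GuestMemfdManager),
    .class_init = guest_memfd_manager_class_init,
    .interfaces = (InterfaceInfo[]) {
        { TYPE_RAM_DISCARD_MANAGER },
        { }
    },
};

static void guest_memfd_manager_register_types(void)
{
    type_register_static(&guest_memfd_manager_info);
}
type_init(guest_memfd_manager_register_types)

A manager object like this would be attached with
memory_region_set_ram_discard_manager(gmm->mr, RAM_DISCARD_MANAGER(gmm)),
which is what would let it also cover guest_memfd regions such as pc.bios
that are not owned by a HostMemoryBackend.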