On 1/21/2025 2:33 AM, Peter Xu wrote:
> On Mon, Jan 20, 2025 at 06:54:14PM +0100, David Hildenbrand wrote:
>> On 20.01.25 18:21, Peter Xu wrote:
>>> On Mon, Jan 20, 2025 at 11:48:39AM +0100, David Hildenbrand wrote:
>>>> Sorry, I was traveling at the end of last week. I wrote a mail on
>>>> the train and apparently it was swallowed somehow ...
>>>>
>>>>>> Not sure that's the right place. Isn't it the (cc) machine that
>>>>>> controls the state?
>>>>>
>>>>> KVM does, via MemoryRegion->RAMBlock->guest_memfd.
>>>>
>>>> Right; I consider KVM part of the machine.
>>>>
>>>>>> It's not really the memory backend, that's just the memory
>>>>>> provider.
>>>>>
>>>>> Sorry, but is "providing memory" not the purpose of a "memory
>>>>> backend"? :)
>>>>
>>>> Hehe, what I wanted to say is that a memory backend is just
>>>> something to create a RAMBlock. There are different ways to create
>>>> a RAMBlock, even guest_memfd ones.
>>>>
>>>> guest_memfd is stored per RAMBlock. I assume the state should be
>>>> stored per RAMBlock as well, maybe as part of a "guest_memfd state"
>>>> thing.
>>>>
>>>> Now, the question is, who is the manager?
>>>>
>>>> 1) The machine. KVM requests the machine to perform the transition,
>>>> and the machine takes care of updating the guest_memfd state and
>>>> notifying any listeners.
>>>>
>>>> 2) The RAMBlock. Then we need some other Object to trigger that.
>>>> Maybe RAMBlock would have to become an object, or we allocate
>>>> separate objects.
>>>>
>>>> I'm leaning towards 1), but I might be missing something.
>>>
>>> A pure question: how do we handle the BIOS gmemfds? I assume they're
>>> shared when the VM starts, since QEMU needs to load the BIOS into
>>> them, but are they always shared, or can they be converted to
>>> private later?
>>
>> You're probably looking for memory_region_init_ram_guest_memfd().
>
> Yes, but I didn't see whether such a gmemfd needs conversions there.
> I saw an answer though from Chenyi in another email:
>
> https://lore.kernel.org/all/fc7194ee-ed21-4f6b-bf87-147a47f5f074@xxxxxxxxx/
>
> So I suppose the BIOS region must support private/shared conversions
> too, just like the rest.

Yes, the BIOS region can support conversion as well. I think
guest_memfd-backed memory regions all follow the same sequence during
setup: the guest_memfd is shared when its fd is created by
kvm_create_guest_memfd() in ram_block_add(), but it is converted to
private right after kvm_set_user_memory_region() in kvm_set_phys_mem()
(see the rough sketch at the bottom of this mail). So at boot time of a
cc VM, the default attribute is private. At runtime, the vBIOS can
still do conversions if it wants.

>
> Though in that case, I'm not 100% sure whether that could also be
> done by reusing the major guest_memfd with some specific offset
> regions.

I'm not sure I understand you correctly. guest_memfd is per-RAMBlock,
and each RAMBlock has its own slot. So the vBIOS can use its own
guest_memfd for those specific offset regions.

>
>>
>>>
>>> I wonder if it's possible (now, or in the future, so it can be >2
>>> fds) that a VM contains multiple guest_memfds which request
>>> different security levels. Then it could be more future-proof for
>>> this to be managed per-fd / per-ramblock / ... rather than per-VM.
>>> For example, always-shared gmemfds could skip the manager and be
>>> treated like normal memory, while confidential gmemfds would still
>>> install the manager.
>>
>> I think all of that is possible with whatever design we choose.
>>
>> The situation is:
>>
>> * guest_memfd is per RAMBlock (block->guest_memfd is set in
>>   ram_block_add)
>> * Some RAMBlocks have a memory backend, others do not. In particular,
>>   the ones calling memory_region_init_ram_guest_memfd() do not.
>>
>> So the *guest_memfd information* (fd, bitmap) really must be stored
>> per RAMBlock.
>>
>> The question is *which object* implements the RamDiscardManager
>> interface to manage the RAMBlocks that have a guest_memfd.
>>
>> We either need
>>
>> 1) Something attached to the RAMBlock, or the RAMBlock itself. This
>>    series does it via a new object attached to the RAMBlock.
>> 2) A per-VM entity (e.g., machine, or a distinct management object).
>>
>> In case of 1), KVM looks up the RAMBlock->object to trigger the state
>> change. That object will inform all listeners.
>>
>> In case of 2), KVM calls the per-VM entity (e.g., a guest_memfd
>> manager), which looks up the RAMBlock and triggers the state change.
>> It will inform all listeners.
>
> (after I finished reading the whole discussion..)
>
> Looks like Yilun raised another point here, on how to reuse the same
> object for device TIO support (conversions for device MMIOs):
>
> https://lore.kernel.org/all/Z4RA1vMGFECmYNXp@yilunxu-OptiPlex-7050/
>
> Thanks,
>
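
For reference, here is the setup-time flow I described above, heavily
simplified (paraphrased from memory, so surrounding details are elided
and this may not match the tree exactly):

    /* 1) ram_block_add(), system/physmem.c: the fd starts out shared */
    if (kvm_enabled() && (new_block->flags & RAM_GUEST_MEMFD)) {
        new_block->guest_memfd = kvm_create_guest_memfd(new_block->max_length,
                                                        0, errp);
    }

    /* 2) kvm_set_phys_mem(), accel/kvm/kvm-all.c: register the slot,
     *    then flip the whole range to private, so a cc VM boots with
     *    everything private by default */
    err = kvm_set_user_memory_region(kml, mem, true);
    ...
    if (memory_region_has_guest_memfd(mr)) {
        err = kvm_set_memory_attributes_private(start_addr, slot_size);
    }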
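
And to make the 1) vs. 2) dispatch above a bit more concrete, from
KVM's conversion path (kvm_convert_memory(), which handles
KVM_EXIT_MEMORY_FAULT) it would look roughly like this; all xxx_* names
are made up for illustration only:

    /* after kvm_convert_memory() has resolved the MemoryRegion for the
     * faulting range */
    RAMBlock *rb = mr->ram_block;

    /* Case 1): the state-change entry point hangs off the RAMBlock
     * (what this series does, via a new object stored there) */
    xxx_state_change(rb->xxx_manager, offset, size, to_private);

    /* Case 2): one per-VM entity, e.g. reachable from the machine,
     * which looks up the RAMBlock's state itself */
    xxx_vm_state_change(MACHINE(qdev_get_machine()), rb,
                        offset, size, to_private);

    /* Either way, whoever flips the shared/private bitmap afterwards
     * notifies the registered RamDiscardListeners (VFIO, ...). */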