On Mon, Jan 20, 2025 at 06:54:14PM +0100, David Hildenbrand wrote:
> On 20.01.25 18:21, Peter Xu wrote:
> > On Mon, Jan 20, 2025 at 11:48:39AM +0100, David Hildenbrand wrote:
> > > Sorry, I was traveling at the end of last week.  I wrote a mail on the
> > > train and apparently it was swallowed somehow ...
> > >
> > > > > Not sure that's the right place.  Isn't it the (cc) machine that
> > > > > controls the state?
> > > >
> > > > KVM does, via MemoryRegion->RAMBlock->guest_memfd.
> > >
> > > Right; I consider KVM part of the machine.
> > >
> > > > > It's not really the memory backend, that's just the memory provider.
> > > >
> > > > Sorry, but isn't "providing memory" the purpose of a "memory backend"? :)
> > >
> > > Hehe, what I wanted to say is that a memory backend is just one way to
> > > create a RAMBlock.  There are different ways to create a RAMBlock, even
> > > guest_memfd ones.
> > >
> > > guest_memfd is stored per RAMBlock.  I assume the state should be stored
> > > per RAMBlock as well, maybe as part of a "guest_memfd state" thing.
> > >
> > > Now, the question is, who is the manager?
> > >
> > > 1) The machine.  KVM requests the machine to perform the transition, and
> > >    the machine takes care of updating the guest_memfd state and notifying
> > >    any listeners.
> > >
> > > 2) The RAMBlock.  Then we need some other object to trigger that.  Maybe
> > >    RAMBlock would have to become an object, or we allocate separate
> > >    objects.
> > >
> > > I'm leaning towards 1), but I might be missing something.
> >
> > A pure question: how do we process the BIOS gmemfds?  I assume they're
> > shared when the VM starts, since QEMU needs to load the BIOS into them,
> > but are they always shared, or can they be converted to private later?
>
> You're probably looking for memory_region_init_ram_guest_memfd().

Yes, but I didn't see there whether such a gmemfd needs conversions.  I did
see an answer from Chenyi in another email, though:

https://lore.kernel.org/all/fc7194ee-ed21-4f6b-bf87-147a47f5f074@xxxxxxxxx/

So I suppose the BIOS region must support private/shared conversions too,
just like the rest of guest memory.  In that case, though, I'm not 100% sure
whether that could also be done by reusing the main guest_memfd at some
specific offset range.

> > I wonder if it's possible (now, or in the future, so it could be more
> > than two fds) that a VM contains multiple guest_memfds which request
> > different security levels.  Then it could be more future-proof for this
> > to be managed per-fd / per-RAMBlock / ... rather than per-VM.  For
> > example, always-shared gmemfds could skip the manager and be treated
> > like normal memory, while confidential gmemfds would still install the
> > manager.
>
> I think all of that is possible with whatever design we choose.
>
> The situation is:
>
> * guest_memfd is per RAMBlock (block->guest_memfd is set in ram_block_add)
> * Some RAMBlocks have a memory backend, others do not.  In particular,
>   the ones calling memory_region_init_ram_guest_memfd() do not.
>
> So the *guest_memfd information* (fd, bitmap) really must be stored per
> RAMBlock.
>
> The question is *which object* implements the RamDiscardManager interface
> to manage the RAMBlocks that have a guest_memfd.
>
> We either need
>
> 1) Something attached to the RAMBlock, or the RAMBlock itself.  This
>    series does it via a new object attached to the RAMBlock.
> 2) A per-VM entity (e.g., the machine, or a distinct management object).
>
> In case of 1), KVM looks up the RAMBlock->object to trigger the state
> change.  That object will inform all listeners.
>
> In case of 2), KVM calls the per-VM entity (e.g., a guest_memfd manager),
> which looks up the RAMBlock and triggers the state change.  It will inform
> all listeners.
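Just to double check I'm reading 1) and 2) the same way you are, here is a
standalone sketch (not QEMU code; BlockStub, convert_range and friends are
made-up names).  In both cases the fd and the bitmap stay per RAMBlock and
the listeners get notified the same way; the only difference is whether KVM
calls into a per-block object or into a per-VM manager that resolves the
block first:

/*
 * Standalone sketch, not QEMU code -- all names are made up.  It only
 * illustrates that the guest_memfd state (fd plus shared/private bitmap)
 * lives with the block, and that a conversion is "flip the bitmap, then
 * notify listeners", regardless of which object is called.
 */
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define PAGE_SIZE 4096ULL

/* Stand-in for a RAMBlock that carries guest_memfd state. */
typedef struct BlockStub {
    const char *name;
    uint64_t size;
    int guest_memfd;        /* the fd, or -1 if the block has none */
    uint8_t *priv_bitmap;   /* one byte per page: 1 = private, 0 = shared */
} BlockStub;

/*
 * Whoever implements the RamDiscardManager-like interface ends up doing
 * roughly this on a conversion request: update the per-block state, then
 * notify the registered listeners (VFIO, ...).
 */
static void convert_range(BlockStub *b, uint64_t offset, uint64_t len,
                          bool to_private)
{
    for (uint64_t off = offset; off < offset + len; off += PAGE_SIZE) {
        b->priv_bitmap[off / PAGE_SIZE] = to_private;
    }
    /* Placeholder for the listener notification. */
    printf("%s: [0x%" PRIx64 "+0x%" PRIx64 "] -> %s\n",
           b->name, offset, len, to_private ? "private" : "shared");
}

int main(void)
{
    BlockStub bios = { .name = "bios", .size = 16 * PAGE_SIZE,
                       .guest_memfd = -1 };
    bios.priv_bitmap = calloc(bios.size / PAGE_SIZE, 1);

    /*
     * Option 1): KVM resolves gpa -> block and calls the block's own
     * manager object.  Option 2): KVM calls one per-VM manager, which
     * resolves the block first.  Both end up in something like this:
     */
    convert_range(&bios, 0, 4 * PAGE_SIZE, true);

    free(bios.priv_bitmap);
    return 0;
}

If that reading is right, the choice between 1) and 2) is mostly about where
the lookup and the object lifecycle live, since the state is per RAMBlock
either way.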
(after reading through the whole discussion..)

Looks like Yilun raised another point here, on how to reuse the same object
for device TIO support (conversions for device MMIOs):

https://lore.kernel.org/all/Z4RA1vMGFECmYNXp@yilunxu-OptiPlex-7050/

Thanks,

-- 
Peter Xu