On 1/24/2025 8:15 AM, Alexey Kardashevskiy wrote:
>
>
> On 22/1/25 16:38, Xiaoyao Li wrote:
>> On 1/22/2025 11:28 AM, Chenyi Qiang wrote:
>>>
>>>
>>> On 1/22/2025 12:35 AM, Peter Xu wrote:
>>>> On Tue, Jan 21, 2025 at 09:35:26AM +0800, Chenyi Qiang wrote:
>>>>>
>>>>>
>>>>> On 1/21/2025 2:33 AM, Peter Xu wrote:
>>>>>> On Mon, Jan 20, 2025 at 06:54:14PM +0100, David Hildenbrand wrote:
>>>>>>> On 20.01.25 18:21, Peter Xu wrote:
>>>>>>>> On Mon, Jan 20, 2025 at 11:48:39AM +0100, David Hildenbrand wrote:
>>>>>>>>> Sorry, I was traveling at the end of last week. I wrote a mail on
>>>>>>>>> the train and apparently it was swallowed somehow ...
>>>>>>>>>
>>>>>>>>>>> Not sure that's the right place. Isn't it the (cc) machine that
>>>>>>>>>>> controls the state?
>>>>>>>>>>
>>>>>>>>>> KVM does, via MemoryRegion->RAMBlock->guest_memfd.
>>>>>>>>>
>>>>>>>>> Right; I consider KVM part of the machine.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> It's not really the memory backend, that's just the memory
>>>>>>>>>>> provider.
>>>>>>>>>>
>>>>>>>>>> Sorry, but is not "providing memory" the purpose of a "memory
>>>>>>>>>> backend"? :)
>>>>>>>>>
>>>>>>>>> Hehe, what I wanted to say is that a memory backend is just
>>>>>>>>> something to create a RAMBlock. There are different ways to create
>>>>>>>>> a RAMBlock, even guest_memfd ones.
>>>>>>>>>
>>>>>>>>> guest_memfd is stored per RAMBlock. I assume the state should be
>>>>>>>>> stored per RAMBlock as well, maybe as part of a "guest_memfd state"
>>>>>>>>> thing.
>>>>>>>>>
>>>>>>>>> Now, the question is, who is the manager?
>>>>>>>>>
>>>>>>>>> 1) The machine. KVM requests the machine to perform the transition,
>>>>>>>>> and the machine takes care of updating the guest_memfd state and
>>>>>>>>> notifying any listeners.
>>>>>>>>>
>>>>>>>>> 2) The RAMBlock. Then we need some other Object to trigger that.
>>>>>>>>> Maybe RAMBlock would have to become an object, or we allocate
>>>>>>>>> separate objects.
>>>>>>>>>
>>>>>>>>> I'm leaning towards 1), but I might be missing something.
>>>>>>>>
>>>>>>>> A pure question: how do we process the bios gmemfds? I assume they
>>>>>>>> are shared when the VM starts if QEMU needs to load the bios into
>>>>>>>> them, but are they always shared, or can they be converted to
>>>>>>>> private later?
>>>>>>>
>>>>>>> You're probably looking for memory_region_init_ram_guest_memfd().
>>>>>>
>>>>>> Yes, but I didn't see whether such a gmemfd needs conversions there.
>>>>>> I saw an answer though from Chenyi in another email:
>>>>>>
>>>>>> https://lore.kernel.org/all/fc7194ee-ed21-4f6b-bf87-147a47f5f074@xxxxxxxxx/
>>>>>>
>>>>>> So I suppose the BIOS region must support private / shared conversions
>>>>>> too, just like the rest.
>>>>>
>>>>> Yes, the BIOS region can support conversion as well. I think
>>>>> guest_memfd-backed memory regions all follow the same sequence during
>>>>> setup time:
>>>>>
>>>>> guest_memfd is shared when the guest_memfd fd is created by
>>>>> kvm_create_guest_memfd() in ram_block_add(), but it will soon be
>>>>> converted to private, just after kvm_set_user_memory_region() in
>>>>> kvm_set_phys_mem(). So at the boot time of a cc VM, the default
>>>>> attribute is private. During runtime, the vBIOS can also do the
>>>>> conversion if it wants.
>>>>
>>>> I see.
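(To make the sequence above concrete: below is a rough sketch expressed
against the raw KVM UAPI (Linux >= 6.8) of what QEMU effectively does; QEMU
wraps these ioctls in kvm_create_guest_memfd() and
kvm_set_user_memory_region(), so the helper name and the abort()-style error
handling here are illustrative only, not the actual QEMU call sites.)

#include <linux/kvm.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

static void setup_default_private_slot(int vm_fd, __u64 gpa, __u64 size,
                                       __u32 slot)
{
    /* 1. Create the guest_memfd; the range starts out as shared. */
    struct kvm_create_guest_memfd gmem = { .size = size };
    int gmem_fd = ioctl(vm_fd, KVM_CREATE_GUEST_MEMFD, &gmem);
    if (gmem_fd < 0)
        abort();

    /* Shared pages still live in an ordinary host mapping. */
    void *shared = mmap(NULL, size, PROT_READ | PROT_WRITE,
                        MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        abort();

    /* 2. Register the slot, tying the GPA range to both the shared
     *    mapping and the guest_memfd. */
    struct kvm_userspace_memory_region2 region = {
        .slot = slot,
        .flags = KVM_MEM_GUEST_MEMFD,
        .guest_phys_addr = gpa,
        .memory_size = size,
        .userspace_addr = (unsigned long)shared,
        .guest_memfd = gmem_fd,
    };
    if (ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION2, &region))
        abort();

    /* 3. Flip the whole range to private right after the slot is set,
     *    so the cc VM boots with "private" as the default attribute. */
    struct kvm_memory_attributes attrs = {
        .address = gpa,
        .size = size,
        .attributes = KVM_MEMORY_ATTRIBUTE_PRIVATE,
    };
    if (ioctl(vm_fd, KVM_SET_MEMORY_ATTRIBUTES, &attrs))
        abort();
}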
>>>>
>>>>>
>>>>>>
>>>>>> Though in that case, I'm not 100% sure whether that could also be
>>>>>> done by reusing the major guest memfd with some specific offset
>>>>>> regions.
>>>>>
>>>>> Not sure if I understand you clearly. guest_memfd is per-RAMBlock. It
>>>>> will have its own slot. So the vBIOS can use its own guest_memfd to
>>>>> get the specific offset regions.
>>>>
>>>> Sorry to be confusing, please feel free to ignore my previous comment.
>>>> That came from a very limited mindset that maybe one confidential VM
>>>> should only have one gmemfd..
>>>>
>>>> Now I see it looks like it's by design open to multiple gmemfds for
>>>> each VM, so it's definitely ok that the bios has its own.
>>>>
>>>> Do you know why the bios needs to be convertible? I wonder whether the
>>>> VM can copy it over to a private region and do whatever it wants, e.g.
>>>> attest the bios being valid. However, this is also more of a pure
>>>> question.. and it can be offtopic to this series, so feel free to
>>>> ignore.
>>>
>>> AFAIK, the vBIOS won't do conversion after it is set as private at the
>>> beginning. But in theory, the VM can do the conversion at runtime with
>>> the current implementation. As for why the vBIOS is made convertible,
>>> I'm also uncertain. Maybe it is convenient to manage the private/shared
>>> status with guest_memfd, as it's also converted once at the beginning.
>>
>> The reason is just that we are too lazy to implement a variant of
>> guest memfd for the vBIOS that is disallowed to be converted from
>> private to shared.
>
> What is the point in disallowing such conversion in QEMU? On AMD, a
> malicious HV can try converting at any time, and if the guest did not ask
> for it, it will continue accessing those pages as private and trigger an
> RMP fault. But if the guest asked for the conversion, then it should be
> no problem to convert to shared. What do I miss about TDX here? Thanks,

Re-reading Peter's question, maybe I misunderstood it a little. I thought
Peter was asking why the vBIOS needs to support page conversion at all,
since it stays private and never needs to convert to shared at runtime. On
that reading, it would not be necessary to manage the vBIOS with a
guest_memfd-backed memory region, as it is only converted to private once,
during the setup stage. Xiaoyao's point was that there was no need to
implement a variant of guest_memfd that disallows conversion from private
to shared. As you said, allowing such conversion won't bring security
issues.

Now, I assume Peter's real question is: can the guest copy the vBIOS into
a private region, so that there is no need to create a dedicated
guest_memfd-backed memory region for it in the first place?
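(For context on that last question: today the vBIOS simply gets its own
guest_memfd-backed RAMBlock via memory_region_init_ram_guest_memfd(). A
minimal sketch of that shape follows; memory_region_init_ram_guest_memfd()
is the real QEMU API, but the wrapper function, region name, GPA/size
parameters and include paths here are illustrative, not the actual x86 init
code.)

#include "qemu/osdep.h"
#include "exec/memory.h"
#include "qapi/error.h"

static void vbios_init_guest_memfd(MemoryRegion *system_memory,
                                   hwaddr bios_gpa, uint64_t bios_size)
{
    MemoryRegion *bios = g_new(MemoryRegion, 1);

    /* Sets RAM_GUEST_MEMFD on the new RAMBlock, so ram_block_add() calls
     * kvm_create_guest_memfd() for this region, as in the earlier sketch. */
    memory_region_init_ram_guest_memfd(bios, NULL, "vbios.gmemfd",
                                       bios_size, &error_fatal);

    /* The BIOS image can be loaded through the shared view at this point;
     * the range flips to private once the KVM slot is registered, and from
     * then on the guest may convert it like any other gmemfd range. */
    memory_region_add_subregion(system_memory, bios_gpa, bios);
}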