On Thu, 2024-10-31 at 13:06 -0300, Jason Gunthorpe wrote:
> On Thu, Oct 31, 2024 at 03:30:59PM +0000, Gowans, James wrote:
> > On Tue, 2024-10-29 at 16:05 -0700, Elliot Berman wrote:
> > > On Mon, Aug 05, 2024 at 11:32:40AM +0200, James Gowans wrote:
> > > > Make the file data usable to userspace by adding mmap. That's all
> > > > that QEMU needs for guest RAM, so that's all we bother implementing
> > > > for now.
> > > >
> > > > When mmapping the file, the VMA is marked as PFNMAP to indicate
> > > > that there are no struct pages for the memory in this VMA.
> > > > remap_pfn_range() is used to actually populate the page tables. All
> > > > PTEs are pre-faulted into the pgtables at mmap time so that the
> > > > pgtables are usable when this virtual address range is given to
> > > > VFIO's MAP_DMA.
> > >
> > > Thanks for sending this out! I'm going through the series with the
> > > intention to see how it might fit within the existing guest_memfd
> > > work for pKVM/CoCo/Gunyah.
> > >
> > > It might've been mentioned in the MM alignment session -- you might
> > > be interested to join the guest_memfd bi-weekly call to see how we
> > > are overlapping [1].
> > >
> > > [1]: https://lore.kernel.org/kvm/ae794891-fe69-411a-b82e-6963b594a62a@xxxxxxxxxx/T/
> >
> > Hi Elliot, yes, I think that there is a lot more overlap with
> > guest_memfd necessary here. The idea was to extend guestmemfs at some
> > point to have a guest_memfd-style interface, but it was pointed out at
> > the MM alignment call that doing so would require guestmemfs to
> > duplicate the API surface of guest_memfd. This is undesirable. Better
> > would be to have persistence implemented as a custom allocator behind
> > a normal guest_memfd. I'm not too sure how this would actually be done
> > in practice, specifically:
> > - how the persistent pool would be defined
> > - how it would be supplied to guest_memfd
> > - how the guest_memfds would be re-discovered after kexec
> > But assuming we can figure out some way to do this, I think it's a
> > better way to go.
>
> I think the filesystem interface seemed reasonable, you just want
> open() on the filesystem to return back a normal guest_memfd and
> re-use all of that code to implement it.
>
> When opened through the filesystem guest_memfd would get hooked by the
> KHO stuff to manage its memory, somehow.
>
> Really KHO just needs to keep track of the addresses in the
> guest_memfd when it serializes, right? So maybe all it needs is a way
> to freeze the guest_memfd so its memory map doesn't change anymore,
> then a way to extract the addresses from it for serialization?

Thanks Jason, that sounds perfect. I'll work on the next rev, which
will:

- expose a filesystem which owns reserved/persistent memory, just like
  this patch.
- be rebased on top of the patches which pull the guest_memfd code out
  into a library.
- be rebased on top of the guest_memfd patches which support adding a
  different backing allocator (hugetlbfs) to guest_memfd.
- when a file in guestmemfs is opened, create a guest_memfd object from
  the guest_memfd library code and set guestmemfs as the custom
  allocator for the file.
- serialise and re-hydrate the guest_memfds created in guestmemfs
  across kexec via KHO.

The main difference is that opening a guestmemfs file won't give a
regular file; rather, it will give a guest_memfd library object. That
gives good code re-use with the guest_memfd library and avoids
re-implementing the guest_memfd API surface here. I've appended a few
rough sketches of what I have in mind below.

Sounds like a great path forward. :-)

JG
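
P.S. Sketches follow. First, roughly what the mmap path quoted at the
top of this thread does. This is a simplified illustration rather than
the actual patch, and guestmemfs_inode_pfn() is a made-up helper
standing in for however the filesystem resolves the base PFN of a
file's persistent reservation:

#include <linux/fs.h>
#include <linux/mm.h>

static int guestmemfs_mmap(struct file *file, struct vm_area_struct *vma)
{
	unsigned long size = vma->vm_end - vma->vm_start;
	unsigned long pfn;

	/* Made-up helper: base PFN of this file's persistent region. */
	pfn = guestmemfs_inode_pfn(file_inode(file)) + vma->vm_pgoff;

	/*
	 * There are no struct pages behind this memory, so the VMA is
	 * marked PFNMAP, and remap_pfn_range() pre-faults every PTE now
	 * so the page tables are populated by the time VFIO's MAP_DMA
	 * walks this range.
	 */
	vm_flags_set(vma, VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP);
	return remap_pfn_range(vma, vma->vm_start, pfn, size,
			       vma->vm_page_prot);
}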
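
Second, the open() path: opening a guestmemfs file hands back a
guest_memfd backed by the persistent pool. None of these interfaces
exist yet, so guest_memfd_attach() and the allocator ops below are
invented names for whatever the guest_memfd library ends up exposing:

/* Invented allocator hooks: folios come from the persistent pool. */
static const struct guest_memfd_allocator_ops guestmemfs_alloc_ops = {
	.alloc_folio	= guestmemfs_alloc_folio,
	.free_folio	= guestmemfs_free_folio,
};

static int guestmemfs_open(struct inode *inode, struct file *file)
{
	/*
	 * Instead of a regular file, back this fd with a guest_memfd
	 * object from the library, with guestmemfs as the custom
	 * allocator for its folios.
	 */
	return guest_memfd_attach(file, i_size_read(inode),
				  &guestmemfs_alloc_ops);
}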
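
Finally, the serialise step, which is really just your
freeze-then-extract idea: once the memory map can no longer change, KHO
only needs the physical ranges behind the file. guest_memfd_freeze(),
guest_memfd_for_each_range() and the kho_*() bits are all placeholders;
KHO's real interface is still being worked out:

static int guestmemfs_kho_serialize(struct file *gmem,
				    struct kho_node *node)
{
	u64 pgoff, phys, nr_pages;
	int err;

	/* Placeholder: no more truncation, conversion or migration. */
	err = guest_memfd_freeze(gmem);
	if (err)
		return err;

	/*
	 * Placeholder walk: record (pgoff, phys, nr_pages) for each
	 * backing range so the next kernel can re-hydrate this file
	 * into the same physical memory after kexec.
	 */
	guest_memfd_for_each_range(gmem, pgoff, phys, nr_pages)
		kho_add_mem_range(node, pgoff, phys, nr_pages);

	return 0;
}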