On Fri, 2024-11-01 at 10:42 -0300, Jason Gunthorpe wrote:
> On Fri, Nov 01, 2024 at 01:01:00PM +0000, Gowans, James wrote:
> > Thanks Jason, that sounds perfect. I'll work on the next rev which will:
> > - expose a filesystem which owns reserved/persistent memory, just like
> >   this patch.
>
> Is this step needed?
>
> If the guest memfd is already told to get 1G pages in some normal way,
> why do we need a dedicated pool just for the KHO filesystem?
>
> Back to my suggestion, can't KHO simply freeze the guest memfd and
> then extract the memory layout, and just use the normal allocator?
>
> Or do you have a hard requirement that only KHO allocated memory can
> be preserved across kexec?

KHO can persist any memory ranges which are not MOVABLE, so provided that
guest_memfd does non-movable allocations, serialising and persisting should
be possible.

There are other requirements here, though: specifically the ability to be
*guaranteed* GiB-level allocations, to have the guest memory out of the
direct map for secret hiding, and to remove the struct page overhead. The
struct page overhead could be handled via HVO, but considering that the
memory must be out of the direct map anyway, it seems unnecessary to have
struct pages at all, and unnecessary to have the memory managed by an
existing allocator. The only existing 1 GiB allocator I know of is
hugetlbfs; let me know if there's something else that could be used. That's
the main motivation for a separate pool allocated at early boot.

This is quite similar to hugetlbfs, so a natural question is whether we
could use and serialise hugetlbfs instead, but that probably opens another
can of worms of complexity. There's also more than just the guest_memfds
and their allocations to serialise: it's probably useful to be able to have
a directory structure in the filesystem, POSIX file ACLs, and perhaps some
other filesystem metadata.
For this reason I still think that having a new filesystem designed for
this use-case, which creates guest_memfd objects when files are opened, is
the way to go.

Let me know what you think.

JG