On Mon, May 13, 2024, James Gowans wrote:
> On Mon, 2024-05-13 at 08:39 -0700, Sean Christopherson wrote:
> > > Sean, you mentioned that you envision guest_memfd also supporting non-CoCo VMs.
> > > Do you have some thoughts about how to make the above cases work in the
> > > guest_memfd context?
> > 
> > Yes. The hand-wavy plan is to allow selectively mmap()ing guest_memfd(). There
> > is a long thread[*] discussing how exactly we want to do that. The TL;DR is that
> > the basic functionality is fairly straightforward; the bulk of the discussion is
> > around gup(), reclaim, page migration, etc.
> 
> I still need to read this long thread, but just a thought on the word
> "restricted" here: for MMIO the instruction can be anywhere and
> similarly the load/store MMIO data can be anywhere. Does this mean that
> for running unmodified non-CoCo VMs with a guest_memfd backend we'll
> always need to have the whole of guest memory mmapped?

Not necessarily, e.g. KVM could re-establish the direct map or mremap() on-demand.
There are variations on that, e.g. if ASI[*] were to ever make its way upstream,
which is a huge if, then we could have guest_memfd mapped into a KVM-only CR3.

> I guess the idea is that this use case will still be subject to the
> normal restriction rules, but for a non-CoCo non-pKVM VM there will be
> no restriction in practice, and userspace will need to mmap everything
> always?
> 
> It really seems yucky to need to have all of guest RAM mmapped all the
> time just for MMIO to work... But I suppose there is no way around that
> for Intel x86.

It's not just MMIO. Nested virtualization, and more specifically shadowing
nested TDP, is also problematic (probably more so than MMIO). And there are
more cases, i.e. we'll need a generic solution for this.

As above, there are a variety of options; it's largely just a matter of doing
the work. I'm not saying it's a trivial amount of work/effort, but it's far
from an unsolvable problem.
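
FWIW, the userspace side of the hand-wavy plan would look something like the
below. To be clear, this is a rough sketch, not actual uAPI: KVM_CREATE_GUEST_MEMFD
exists today, but mmap() on a guest_memfd fd does not, i.e. the mmap() here
assumes the selective-mmap support from the aforementioned thread actually lands.

/*
 * Sketch only: error handling is minimal, and creating a guest_memfd may
 * require a VM type that supports it, e.g. KVM_X86_SW_PROTECTED_VM on x86.
 */
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <linux/kvm.h>

int main(void)
{
	int kvm = open("/dev/kvm", O_RDWR);
	int vm = ioctl(kvm, KVM_CREATE_VM, 0);

	struct kvm_create_guest_memfd gmem = {
		.size  = 2UL << 20,	/* 2MiB of guest memory */
		.flags = 0,
	};
	int gmem_fd = ioctl(vm, KVM_CREATE_GUEST_MEMFD, &gmem);
	if (gmem_fd < 0) {
		perror("KVM_CREATE_GUEST_MEMFD");
		return 1;
	}

	/*
	 * Hypothetical: map a single page on-demand, e.g. just the page
	 * targeted by an emulated MMIO access, instead of keeping all of
	 * guest RAM mmap()ed all the time.
	 */
	void *hva = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED,
			 gmem_fd, 0);
	if (hva == MAP_FAILED)
		perror("mmap");
	else
		munmap(hva, 4096);

	close(gmem_fd);
	close(vm);
	close(kvm);
	return 0;
}

The point being that only the pages userspace (or KVM internally) actually
needs for emulation would be mapped at any given time, not all of guest RAM.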