Re: Unmapping KVM Guest Memory from Host Kernel

On Mon, 2024-05-13 at 10:09 -0700, Sean Christopherson wrote:
> On Mon, May 13, 2024, James Gowans wrote:
> > On Mon, 2024-05-13 at 08:39 -0700, Sean Christopherson wrote:
> > > > Sean, you mentioned that you envision guest_memfd also supporting non-CoCo VMs.
> > > > Do you have some thoughts about how to make the above cases work in the
> > > > guest_memfd context?
> > > 
> > > Yes.  The hand-wavy plan is to allow selectively mmap()ing guest_memfd().  There
> > > is a long thread[*] discussing how exactly we want to do that.  The TL;DR is that
> > > the basic functionality is also straightforward; the bulk of the discussion is
> > > around gup(), reclaim, page migration, etc.
> > 
> > I still need to read this long thread, but just a thought on the word
> > "restricted" here: for MMIO the instruction can be anywhere, and
> > similarly the load/store MMIO data can be anywhere. Does this mean
> > that, for running unmodified non-CoCo VMs with a guest_memfd backend,
> > we'll always need to have the whole of guest memory mmapped?
> 
> Not necessarily, e.g. KVM could re-establish the direct map or mremap() on-demand.
> There are variations on that, e.g. if ASI[*] were to ever make its way upstream,
> which is a huge if, then we could have guest_memfd mapped into a KVM-only CR3.

Yes, on-demand mapping of guest RAM pages is definitely an option. It
sounds quite painful, though, to have to always go via interfaces that
demand-map/fault the memory in, and potentially quite slow too, given
the need to unmap and flush afterwards.

Not too sure what you have in mind with "guest_memfd mapped into KVM-
only CR3" - could you expand?

> > I guess the idea is that this use case will still be subject to the
> > normal restriction rules, but for a non-CoCo non-pKVM VM there will be
> > no restriction in practice, and userspace will need to mmap everything
> > always?
> > 
> > It really seems yucky to need to have all of guest RAM mmapped all the
> > time just for MMIO to work... But I suppose there is no way around that
> > for Intel x86.
> 
> It's not just MMIO.  Nested virtualization, and more specifically shadowing nested
> TDP, is also problematic (probably more so than MMIO).  And there are more cases,
> i.e. we'll need a generic solution for this.  As above, there are a variety of
> options, it's largely just a matter of doing the work.  I'm not saying it's a
> trivial amount of work/effort, but it's far from an unsolvable problem.

I didn't even think of nested virt, but that will absolutely be an even
bigger problem. MMIO was just the first roadblock that illustrated the
problem.

Overall, what I'm trying to figure out is whether there is any sane path
here other than mmapping all guest RAM all the time. Trying to get
nested virt, MMIO, and whatever else needs access to guest RAM working
via just-in-time (aka on-demand) mappings and unmappings of guest RAM
sounds like a painful game of whack-a-mole, and potentially really bad
for performance too.

Do you think we should look at doing this on-demand mapping, or, for
now, simply require that all guest RAM be mmapped all the time and that
KVM be given a valid virtual address for the memslots?

Note that I'm specifically referring to regular non-CoCo, non-enlightened
VMs here. For CoCo we definitely need all the cooperative MMIO and
sharing. What we're trying to do here is get guest RAM out of the direct
map using guest_memfd, and now we're tackling the knock-on problem of
whether or not to mmap all of guest RAM all the time in userspace.

JG



