Re: folio_mmapped

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi David,

...

>>>> "mmap() the whole thing once and only access what you are supposed to
> >   (> > > access" sounds reasonable to me. If you don't play by the rules, you get a
> >>>> signal.
> >>>
> >>> "... you get a signal, or maybe you don't". But yes I understand your
> >>> point, and as per the above there are real benefits to this approach so
> >>> why not.
> >>>
> >>> What do we expect userspace to do when a page goes from shared back to
> >>> being guest-private, because e.g. the guest decides to unshare? Use
> >>> munmap() on that page? Or perhaps an madvise() call of some sort? Note
> >>> that this will be needed when starting a guest as well, as userspace
> >>> needs to copy the guest payload in the guestmem file prior to starting
> >>> the protected VM.
> >>
> >> Let's assume we have the whole guest_memfd mapped exactly once in our
> >> process, a single VMA.
> >>
> >> When setting up the VM, we'll write the payload and then fire up the VM.
> >>
> >> That will (I assume) trigger some shared -> private conversion.
> >>
> >> When we want to convert shared -> private in the kernel, we would first
> >> check if the page is currently mapped. If it is, we could try unmapping that
> >> page using an rmap walk.
> >
> > I had not considered that. That would most certainly be slow, but a well
> > behaved userspace process shouldn't hit it so, that's probably not a
> > problem...
>
> If there really only is a single VMA that covers the page (or even mmaps
> the guest_memfd), it should not be too bad. For example, any
> fallocate(PUNCHHOLE) has to do the same, to unmap the page before
> discarding it from the pagecache.

I don't think that we can assume that only a single VMA covers a page.

> But of course, no rmap walk is always better.

We've been thinking some more about how to handle the case where the
host userspace has a mapping of a page that later becomes private.

One idea is to refuse to run the guest (i.e., exit vcpu_run() to back
to the host with a meaningful exit reason) until the host unmaps that
page, and check for the refcount to the page as you mentioned earlier.
This is essentially what the RFC I sent does (minus the bugs :) ) .

The other idea is to use the rmap walk as you suggested to zap that
page. If the host tries to access that page again, it would get a
SIGBUS on the fault. This has the advantage that, as you'd mentioned,
the host doesn't need to constantly mmap() and munmap() pages. It
could potentially be optimised further as suggested if we have a
cooperating VMM that would issue a MADV_DONTNEED or something like
that, but that's just an optimisation and we would still need to have
the option of the rmap walk. However, I was wondering how practical
this idea would be if more than a single VMA covers a page?

Also, there's the question of what to do if the page is gupped? In
this case I think the only thing we can do is refuse to run the guest
until the gup (and all references) are released, which also brings us
back to the way things (kind of) are...

Thanks,
/fuad




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux