Re: [RFC PATCH v1 00/26] KVM: Restricted mapping of guest_memfd at the host and pKVM/arm64 support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Elliot,

On Thu, Feb 22, 2024 at 11:44 PM Elliot Berman <quic_eberman@xxxxxxxxxxx> wrote:
>
> On Thu, Feb 22, 2024 at 04:10:21PM +0000, Fuad Tabba wrote:
> > This series adds restricted mmap() support to guest_memfd [1], as
> > well as support guest_memfd on pKVM/arm64.
> >
> > We haven't started using this in Android yet, but we aim to move
> > away from anonymous memory to guest_memfd once we have the
> > necessary support merged upstream. Others (e.g., Gunyah [8]) are
> > also looking into guest_memfd for similar reasons as us.
>
> I'm especially interested to see if we can factor out much of the common
> implementation bits between KVM and Gunyah. In principle, we're doing
> same thing: the difference is the exact mechanics to interact with the
> hypervisor which (I think) could be easily extracted into an ops
> structure.

I agree. We should share and reuse as much code as possible. I'll sync
with you before the V2 of this series.

> [...]
>
> > In addition to general feedback, we would like feedback on how we
> > handle mmap() and faulting-in guest pages at the host (KVM: Add
> > restricted support for mapping guest_memfd by the host).
> >
> > We don't enforce the invariant that only memory shared with the
> > host can be mapped by the host userspace in
> > file_operations::mmap(), but in vm_operations_struct:fault(). On
> > vm_operations_struct::fault(), we check whether the page is
> > shared with the host. If not, we deliver a SIGBUS to the current
> > task. The reason for enforcing this at fault() is that mmap()
> > does not elevate the pagecount(); it's the faulting in of the
> > page which does. Even if we were to check at mmap() whether an
> > address can be mapped, we would still need to check again on
> > fault(), since between mmap() and fault() the status of the page
> > can change.
> >
> > This creates the situation where access to successfully mmap()'d
> > memory might SIGBUS at page fault. There is precedence for
> > similar behavior in the kernel I believe, with MADV_HWPOISON and
> > the hugetlbfs cgroups controller, which could SIGBUS at page
> > fault time depending on the accounting limit.
>
> I added a test: folio_mmapped() [1] which checks if there's a vma
> covering the corresponding offset into the guest_memfd. I use this
> test before trying to make page private to guest and I've been able to
> ensure that userspace can't even mmap() private guest memory. If I try
> to make memory private, I can test that it's not mmapped and not allow
> memory to become private. In my testing so far, this is enough to
> prevent SIGBUS from happening.
>
> This test probably should be moved outside Gunyah specific code, and was
> looking for maintainer to suggest the right home for it :)
>
> [1]: https://lore.kernel.org/all/20240222-gunyah-v17-20-1e9da6763d38@xxxxxxxxxxx/

Let's see what the mm-folks think about this [*].

Thanks!
/fuad

[*] https://lore.kernel.org/all/ZdfoR3nCEP3HTtm1@xxxxxxxxxxxxxxxxxxxx/


> >
> > Another pKVM specific aspect we would like feedback on, is how to
> > handle memory mapped by the host being unshared by a guest. The
> > approach we've taken is that on an unshare call from the guest,
> > the host userspace is notified that the memory has been unshared,
> > in order to allow it to unmap it and mark it as PRIVATE as
> > acknowledgment. If the host does not unmap the memory, the
> > unshare call issued by the guest fails, which the guest is
> > informed about on return.
> >
> > Cheers,
> > /fuad
> >
> > [1] https://lore.kernel.org/all/20231105163040.14904-1-pbonzini@xxxxxxxxxx/
> >
> > [2] https://android-kvm.googlesource.com/linux/+/refs/heads/for-upstream/pkvm-core
> >
> > [3] https://android-kvm.googlesource.com/linux/+/refs/heads/tabba/guestmem-6.8-rfc-v1
> >
> > [4] https://android-kvm.googlesource.com/kvmtool/+/refs/heads/tabba/guestmem-6.8
> >
> > [5] Protected KVM on arm64 (slides)
> > https://static.sched.com/hosted_files/kvmforum2022/88/KVM%20forum%202022%20-%20pKVM%20deep%20dive.pdf
> >
> > [6] Protected KVM on arm64 (video)
> > https://www.youtube.com/watch?v=9npebeVFbFw
> >
> > [7] Supporting guest private memory in Protected KVM on Android (presentation)
> > https://lpc.events/event/17/contributions/1487/
> >
> > [8] Drivers for Gunyah (patch series)
> > https://lore.kernel.org/all/20240109-gunyah-v16-0-634904bf4ce9@xxxxxxxxxxx/
>
> As of 5 min ago when I send this, there's a v17:
> https://lore.kernel.org/all/20240222-gunyah-v17-0-1e9da6763d38@xxxxxxxxxxx/
>
> Thanks,
> Elliot
>





[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux