On 23.01.25 10:48, Fuad Tabba wrote:
On Wed, 22 Jan 2025 at 22:10, David Hildenbrand <david@xxxxxxxxxx> wrote:
On 22.01.25 16:27, Fuad Tabba wrote:
Make kvm_(read|/write)_guest_page() capable of accessing guest
memory for slots that don't have a userspace address, but only if
the memory is mappable, which also indicates that it is
accessible by the host.
Interesting. So far my assumption was that, for shared memory, user
space would simply mmap() guest_memdd and pass it as userspace address
to the same memslot that has this guest_memfd for private memory.
Wouldn't that be easier in the first shot? (IOW, not require this patch
with the cost of faulting the shared page into the page table on access)
In light of:
https://lkml.kernel.org/r/20250117190938.93793-4-imbrenda@xxxxxxxxxxxxx
there can, in theory, be memslots that start at address 0 and have a
"valid" mapping. This case is done from the kernel (and on special s390x
hardware), though, so it does not apply here at all so far.
In practice, getting address 0 as a valid address is unlikely, because
the default:
$ sysctl vm.mmap_min_addr
vm.mmap_min_addr = 65536
usually prohibits it for good reason.
This has to do more with the ABI I had for pkvm and shared memory
implementations, in which you don't need to specify the userspace
address for memory in a guestmem memslot. The issue is there is no
obvious address to map it to. This would be the case in kvm:arm64 for
tracking paravirtualized time, which the userspace doesn't necessarily
need to interact with, but kvm does.
So I understand correctly: userspace wouldn't have to mmap it because it
is not interested in accessing it, but there is nothing speaking against
mmaping it, at least in the first shot.
I assume it would not be a private memslot (so far, my understanding is
that internal memslots never have a guest_memfd attached).
kvm_gmem_create() is only called via KVM_CREATE_GUEST_MEMFD, to be set
on user-created memslots.
That said, we could always have a userspace address dedicated to
mapping shared locations, and use that address when the necessity
arises. Or we could always require that memslots have a userspace
address, even if not used. I don't really have a strong preference.
So, the simpler version where user space would simply mmap guest_memfd
to provide the address via userspace_addr would at least work for the
use case of paravirtualized time?
It would get rid of the immediate need for this patch and patch #4 to
get it flying.
One interesting question is: when would you want shared memory in
guest_memfd and *not* provide it as part of the same memslot.
One nice thing about the mmap might be that access go via user-space
page tables: E.g., __kvm_read_guest_page can just access the memory
without requiring the folio lock and an additional temporary folio
reference on every access -- it's handled implicitly via the mapcount.
(of course, to map the page we still need that once on the fault path)
--
Cheers,
David / dhildenb