On 20/11/2024 13:55, Paolo Bonzini wrote:
>> Patch 4 allows calling the ioctl from a separate (non-VMM) process. This
>> has been prohibited by [3], but I have not been able to locate the exact
>> justification for the requirement.
> The justification is that the "struct kvm" has a long-lived tie to a
> host process's address space.
> Invoking ioctls like KVM_SET_USER_MEMORY_REGION and KVM_RUN from
> different processes would make things very messy, because it is not
> clear which mm you are working with: the MMU notifier is registered for
> kvm->mm, but some functions, such as get_user_pages, do not take an mm
> and always operate on current->mm.
That's fair, thanks for the explanation.
> In your case, it should be enough to add an ioctl on the guest_memfd
> instead?
That's right. That would be sufficient indeed. Is that something that
could be considered? Would that be some non-KVM API, with guest_memfd
moving to an mm library?
> But the real question is, what are you using
> KVM_X86_SW_PROTECTED_VM for?
The concrete use case is VM restoration from a snapshot in Firecracker
[1]. In the current setup, the VMM registers a UFFD against the guest
memory and sends the UFFD handle to an external process that knows how
to obtain the snapshotted memory. We would like to preserve the
semantics, but also remove the guest memory from the direct map [2].
Mimicking this with guest_memfd would mean sending some form of
guest_memfd handle to that process, which would then use it to populate
guest_memfd.
[1]: https://github.com/firecracker-microvm/firecracker/blob/main/docs/snapshotting/handling-page-faults-on-snapshot-resume.md#userfaultfd
[2]: https://lore.kernel.org/kvm/20241030134912.515725-1-roypat@xxxxxxxxxxxx/T/
Paolo