Hi Oliver,
On 3/24/22 1:28 AM, Oliver Upton wrote:
On Tue, Mar 22, 2022 at 04:07:06PM +0800, Gavin Shan wrote:
This supports ioctl commands on VM to manage the various objects.
It's primarily used by VMM to accomplish migration. The ioctl
commands introduced by this are highlighted as below:
* KVM_SDEI_CMD_GET_VERSION
Retrieve the version of current implementation. It's different
from the version of the followed SDEI specification. This version
is used to indicates what functionalities documented in the SDEI
specification have been supported or not supported.
Don't we need a way to set the version as well? KVM is very much
responsible for upholding ABI of older specs. So, if a VMM and guest
expect SDEI v1.1, we can't just forcibly raise it to something else
during a migration.
The PSCI implementation is a great example of how KVM has grown its
implementation in line with a specification, all the while preserving
backwards compatibility.
The only information feed by VMM is the exposed events. The events
can't be registered from guest kernel, and raised from host to guest
kernel until it's exposed by VMM. Besides, the exposed events will
be defined staticly in host/KVM as we discussed on PATCH[02/22]. We
also discussed to eliminate those ioctl commands. So I think we needn't
to add KVM_SDEI_CMD_SET_VERSION. Further more, the version is only a
concern to host itself if the migration can be done through the
firmware pseudo system registers since the migration compatibility
is the only concern to VMM (QEMU).
Yes, Currently, 0.1/0.2/1.0 versions are supported by PSCI. 0.1 is
picked until VMM asks for 0.2 and 1.0 explicitly. However, it seems
QEMU isn't using 1.0 PSCI yet and maybe more patch is needed to enable
it.
* KVM_SDEI_CMD_GET_EXPOSED_EVENT_COUNT
Return the total count of exposed events.
* KVM_SDEI_CMD_GET_EXPOSED_EVENT
* KVM_SDEI_CMD_SET_EXPOSED_EVENT
Get or set exposed event
* KVM_SDEI_CMD_GET_REGISTERED_EVENT_COUNT
Return the total count of registered events.
* KVM_SDEI_CMD_GET_REGISTERED_EVENT
* KVM_SDEI_CMD_SET_REGISTERED_EVENT
Get or set registered event.
Any new UAPI needs to be documented in Documentation/virt/kvm/api.rst
Additionally, we desperately need a better, generic way to save/restore
VM scoped state. IMO, we should only be adding ioctls if we are
affording userspace a meaningful interface. Every save/restore pair of
ioctls winds up wasting precious ioctl numbers and requires userspace
take a change to read/write an otherwise opaque value.
Marc had made some suggestions in this area already that Raghavendra
experimented with [1], and I think its time to meaningfully consider
our options. Basically, KVM_GET_REG_LIST needs to convey whether a
particular register is VM or vCPU state. We only need to save/restore a
VM state register once. That way, userspace doesn't have to care about
the underlying data and the next piece of VM state that comes along
doesn't require an ioctl nr nor VMM participation.
[1]: http://lore.kernel.org/r/20220224172559.4170192-3-rananta@xxxxxxxxxx
Thanks for the pointer to Raghavendra's series. The firmware pseudo
system registers have been classified into VM and VCPU scoped in the
series. I think it fits the SDEI migration requirements very well.
The shared events can even be migrated through the VM scoped firmware
pseudo system registers. However, I don't plan to support it in next
revision (v6) as currently needed events are all private. I may
spend more time to go through Raghavendra's series later.
Thanks,
Gavin
_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm