Re: [PATCH 01/12] KVM: arm64: Document PV-time interface

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Nov 28, 2018 at 02:45:16PM +0000, Steven Price wrote:
> We introduce a paravirtualization interface for KVM/arm64 based on the
> "Arm Paravirtualized Time for Arm-Base Systems" specification DEN 0057A.
> 
> User space can specify a reserved area of memory for the guest and
> inform KVM to populate the memory with information on stolen time and
> Live Physical Time (LPT) that can be used to derive a stable
> counter/timer for a guest subject to migration between hosts with
> different counter frequencies.
> 
> A hypercall interface is provided for the guest to interrogate the
> hypervisor's support for this interface and the location of the shared
> memory structures.
> 
> Signed-off-by: Steven Price <steven.price@xxxxxxx>
> ---
>  Documentation/virtual/kvm/arm/pvtime.txt | 169 +++++++++++++++++++++++
>  1 file changed, 169 insertions(+)
>  create mode 100644 Documentation/virtual/kvm/arm/pvtime.txt
> 
> diff --git a/Documentation/virtual/kvm/arm/pvtime.txt b/Documentation/virtual/kvm/arm/pvtime.txt
> new file mode 100644
> index 000000000000..1870b904075b
> --- /dev/null
> +++ b/Documentation/virtual/kvm/arm/pvtime.txt
> @@ -0,0 +1,169 @@
> +Paravirtualized time support for arm64
> +======================================
> +
> +Arm specification DEN0057/A defined a standard for paravirtualised time
> +support for Aarch64 guests:
> +
> +https://developer.arm.com/docs/den0057/a
> +
> +KVM/Arm64 implements this specification by providing some hypervisor service
> +calls to support a paravirtualized guest obtaining a view of the amount of
> +time stolen from its execution and a concept of Live Physical Time (LPT) which
> +represents time during which the guest is running and works across migrations.
> +
> +Three new SMCCC compatible hypercalls are defined:
> +
> +PV_FEATURES 0xC5000020
> +PV_TIME_LPT 0xC5000021
> +PV_TIME_ST  0xC5000022
> +
> +These are only available in the SMC64/HVC64 calling convention as
> +paravirtualized time is not available to 32 bit Arm guests.
> +
> +PV_FEATURES
> +    Function ID:  (uint32)  : 0xC5000020
> +    PV_func_id:   (uint32)  : Either PV_TIME_LPT or PV_TIME_ST
> +    Return value: (int32)   : NOT_SUPPORTED (-1) or SUCCESS (0) if the relevant
> +                              PV-time feature is supported by the hypervisor.

What happens when a new guest tries this hypercall on an old host? Why not
return a bitmap of all supported features so the guest doesn't have to
do multiple hypercalls to build its bitmap?

> +
> +PV_TIME_LPT
> +    Function ID:  (uint32)  : 0xC5000021
> +    Flags:        (uint32)  : Bit[0]: Request migration interrupts
> +                                      (not currently supported by KVM)
> +    Return value: (int64)   : IPA of the shared live physical time data
> +                              structure or negative error code on failure:
> +                              NOT_SUPPORTED (-1)
> +                              INVALID_PARAMETERS (-2)
> +
> +PV_TIME_ST
> +    Function ID:  (uint32)  : 0xC5000022
> +    Return value: (int64)   : IPA of the stolen time data structure for this
> +                              (V)CPU. On failure:
> +                              NOT_SUPPORTED (-1)
> +
> +Live Physical Time
> +------------------
> +
> +The structure pointed to by the PV_TIME_LPT hypercall is as follows:
> +
> +  Field           | Byte Length | Byte Offset | Description
> +  --------------- | ----------- | ----------- | -------------------------------
> +  Revision        |      4      |      0      | Must be 0 for this revision
> +  Attributes      |      4      |      4      | Must be 0
> +  sequence_number |      8      |      8      | Bit 0: reserved
> +                  |             |             | Bits 1:63 number of migrations
> +  scale_mult      |      8      |      16     | Multiplier to scale from native
> +                  |             |             | frequency to PV frequency
> +  shift           |      4      |      24     | Shift applied before multiplier
> +  Reserved        |      4      |      28     | Must be 0
> +  Fn              |      8      |      32     | Native frequency
> +  Fpv             |      8      |      40     | Paravirtualized frequency seen
> +                  |             |             | by guest
> +  div_by_fpv_mult |      8      |      48     | Multiplier to implement fast
> +                  |             |             | divide by Fpv

Here's kvmclock's struct

 struct pvclock_vcpu_time_info {
        u32   version; 
        u32   pad0;
        u64   tsc_timestamp;
        u64   system_time;
        u32   tsc_to_system_mul;
        s8    tsc_shift;
        u8    flags;
        u8    pad[2];
 }

 Revision	 =>
 Attributes	 =>
 sequence_number => version
 scale_mult	 => tsc_to_system_mul	(this is reversed, but OK)
 shift		 => tsc_shift           (also reversed)
 Reserved	 =>
 Fn		 => (pvclock doesn't have, but does have system_time)
 Fpv		 =>
 div_by_fpv_mult =>

I haven't thought about this enough yet to be sure kvmclock's fields
are sufficient, but several look close - although the 'tsc' naming
isn't nice. Also, the pvclock struct could be extended by adding an
'extended' flag to 'flags', and then appending more fields.

> +
> +Where scale_mult is defined as 2^(64-shift) * Fpv / Fn
> +
> +The structure will be updated by the hypervisor whenever the guest is migrated
> +to a new host. It will be present within a reserved region of the normal
> +memory given to the guest. The guest should not attempt to write into this
> +memory.
> +
> +Stolen Time
> +-----------
> +
> +The structure pointed to by the PV_TIME_ST hypercall is as follows:
> +
> +  Field       | Byte Length | Byte Offset | Description
> +  ----------- | ----------- | ----------- | --------------------------
> +  Revision    |      4      |      0      | Must be 0 for version 0.1
> +  Attributes  |      4      |      4      | Must be 0
> +  Stolen time |      8      |      8      | Stolen time in unsigned
> +              |             |             | nanoseconds indicating how
> +              |             |             | much time this VCPU thread
> +              |             |             | was involuntarily not
> +              |             |             | running on a physical CPU.
> +
> +The structure will be updated by the hypervisor periodically as time is stolen
> +from the VCPU. It will be present within a reserved region of the normal
> +memory given to the guest. The guest should not attempt to write into this
> +memory. There is a structure by VCPU of the guest.
> +
> +User space interface
> +====================
> +
> +User space can request that KVM provide the paravirtualized time interface to
> +a guest by creating a KVM_DEV_TYPE_ARM_PV_TIME device, for example:
> +
> +    struct kvm_create_device pvtime_device = {
> +            .type = KVM_DEV_TYPE_ARM_PV_TIME,
> +            .attr = 0,
> +            .flags = 0,
> +    };
> +
> +    pvtime_fd = ioctl(vm_fd, KVM_CREATE_DEVICE, &pvtime_device);
> +
> +The guest IPA of the structures must be given to KVM. This is the address of
> +the LPT structure and the base address of an array of stolen time structures
> +(one for each VCPU). For example:
> +
> +    struct kvm_device_attr lpt_base = {
> +            .group = KVM_DEV_ARM_PV_TIME_PADDR,
> +            .attr = KVM_DEV_ARM_PV_TIME_LPT,
> +            .addr = (u64)(unsigned long)&lpt_paddr
> +    };
> +    struct kvm_device_attr st_base = {
> +            .group = KVM_DEV_ARM_PV_TIME_PADDR,
> +            .attr = KVM_DEV_ARM_PV_TIME_ST,
> +            .addr = (u64)(unsigned long)&st_paddr
> +    };
> +
> +    ioctl(pvtime_fd, KVM_SET_DEVICE_ATTR, &lpt_base);
> +    ioctl(pvtime_fd, KVM_SET_DEVICE_ATTR, &st_base);
> +
> +The paravirtualized frequency of the guest can also be set. By default this
> +will be the counter frequency of the host. However when migrating a guest from
> +another host, this must be manually set to ensure that the guest sees the same
> +frequency.
> +
> +    u32 frequency;
> +
> +    struct kvm_device_attr lpt_freq = {
> +            .group = KVM_DEV_ARM_PV_TIME_FREQUENCY,
> +            .attr = KVM_DEV_ARM_PV_TIME_LPT,
> +            .addr = (u64)(unsigned long)&frequency
> +    };
> +
> +    ioctl(pvtime_fd, KVM_SET_DEVICE_ATTR, &lpt_freq);
> +
> +For migration (or save/restore) of a guest it is necessary to save the contents
> +of the shared pages and later restore them. KVM_DEV_ARM_PV_TIME_STATE_SIZE
> +provides the size of this data and KVM_DEV_ARM_PV_TIME_STATE allows the state
> +to be read/written. The state for stolen time and LPT are accessed separately.

I guess there's no harm in userspace accessing these separately, but I
hope we'll be trying to put all the PV structures on one shared page
for the guest.

Thanks,
drew

> +It is also necessary for the physical address and frequency to be set
> +identically when restoring. The kernel will update the structure on first run
> +of the vCPU(s) to contain the new coefficients.
> +
> +    void *save_state(int fd, u64 attr, u32 *size) {
> +        struct kvm_device_attr get_size = {
> +                .group = KVM_DEV_ARM_PV_TIME_STATE_SIZE,
> +                .attr = attr,
> +                .addr = (u64)(unsigned long)size
> +        };
> +
> +        ioctl(fd, KVM_GET_DEVICE_ATTR, get_size);
> +
> +        void *buffer = malloc(*size);
> +
> +        struct kvm_device_attr get_state = {
> +                .group = KVM_DEV_ARM_PV_TIME_STATE,
> +                .attr = attr,
> +                .addr = (u64)(unsigned long)size
> +        };
> +
> +        ioctl(fd, KVM_GET_DEVICE_ATTR, buffer);
> +    }
> +
> +    void *lpt_state = save_state(pvtime_fd, KVM_DEV_ARM_PV_TIME_LPT, &lpt_size);
> +    void *st_state = save_state(pvtime_fd, KVM_DEV_ARM_PV_TIME_ST, &st_size);
> +
> -- 
> 2.19.2
> 
> _______________________________________________
> kvmarm mailing list
> kvmarm@xxxxxxxxxxxxxxxxxxxxx
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm



[Index of Archives]     [Linux KVM]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux