On 25.04.2013, at 14:07, Gleb Natapov wrote: > On Thu, Apr 25, 2013 at 12:47:39PM +0200, Alexander Graf wrote: >> >> On 25.04.2013, at 11:43, Gleb Natapov wrote: >> >>> On Fri, Apr 12, 2013 at 07:08:42PM -0500, Scott Wood wrote: >>>> Currently, devices that are emulated inside KVM are configured in a >>>> hardcoded manner based on an assumption that any given architecture >>>> only has one way to do it. If there's any need to access device state, >>>> it is done through inflexible one-purpose-only IOCTLs (e.g. >>>> KVM_GET/SET_LAPIC). Defining new IOCTLs for every little thing is >>>> cumbersome and depletes a limited numberspace. >>>> >>>> This API provides a mechanism to instantiate a device of a certain >>>> type, returning an ID that can be used to set/get attributes of the >>>> device. Attributes may include configuration parameters (e.g. >>>> register base address), device state, operational commands, etc. It >>>> is similar to the ONE_REG API, except that it acts on devices rather >>>> than vcpus. >>>> >>>> Both device types and individual attributes can be tested without having >>>> to create the device or get/set the attribute, without the need for >>>> separately managing enumerated capabilities. >>>> >>>> Signed-off-by: Scott Wood <scottwood@xxxxxxxxxxxxx> >>>> --- >>>> v4: >>>> - Move some boilerplate back into generic code, as requested by Gleb. >>>> File descriptor management and reference counting is no longer the >>>> concern of the device implementation. >>>> >>>> - Don't hold kvm->lock during create. The original reasons >>>> for doing so have vanished as for as MPIC is concerned, and >>>> this avoids needing to answer the question of whether to >>>> hold the lock during destroy as well. >>>> >>>> Paul, you may need to acquire the lock yourself in kvm_create_xics() >>>> to protect the -EEXIST check. >>>> >>>> v3: remove some changes that were merged into this patch by accident, >>>> and fix the error documentation for KVM_CREATE_DEVICE. >>>> --- >>>> Documentation/virtual/kvm/api.txt | 70 ++++++++++++++++ >>>> Documentation/virtual/kvm/devices/README | 1 + >>>> include/linux/kvm_host.h | 35 ++++++++ >>>> include/uapi/linux/kvm.h | 27 +++++++ >>>> virt/kvm/kvm_main.c | 129 ++++++++++++++++++++++++++++++ >>>> 5 files changed, 262 insertions(+) >>>> create mode 100644 Documentation/virtual/kvm/devices/README >>>> >>>> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt >>>> index 976eb65..d52f3f9 100644 >>>> --- a/Documentation/virtual/kvm/api.txt >>>> +++ b/Documentation/virtual/kvm/api.txt >>>> @@ -2173,6 +2173,76 @@ header; first `n_valid' valid entries with contents from the data >>>> written, then `n_invalid' invalid entries, invalidating any previously >>>> valid entries found. >>>> >>>> +4.79 KVM_CREATE_DEVICE >>>> + >>>> +Capability: KVM_CAP_DEVICE_CTRL >>>> +Type: vm ioctl >>>> +Parameters: struct kvm_create_device (in/out) >>>> +Returns: 0 on success, -1 on error >>>> +Errors: >>>> + ENODEV: The device type is unknown or unsupported >>>> + EEXIST: Device already created, and this type of device may not >>>> + be instantiated multiple times >>>> + >>>> + Other error conditions may be defined by individual device types or >>>> + have their standard meanings. >>>> + >>>> +Creates an emulated device in the kernel. The file descriptor returned >>>> +in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR. >>>> + >>>> +If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the >>>> +device type is supported (not necessarily whether it can be created >>>> +in the current vm). >>>> + >>>> +Individual devices should not define flags. Attributes should be used >>>> +for specifying any behavior that is not implied by the device type >>>> +number. >>>> + >>>> +struct kvm_create_device { >>>> + __u32 type; /* in: KVM_DEV_TYPE_xxx */ >>>> + __u32 fd; /* out: device handle */ >>>> + __u32 flags; /* in: KVM_CREATE_DEVICE_xxx */ >>>> +}; >>> Should we add __u32 padding here to make struct size multiple of u64? >> >> Do you know of any arch that pads structs to u64 boundaries? x86_64 doesn't and ppc64 doesn't either. >> > Not really. I just notices that we pad some structures to that effect. I don't think we really need to :). > >>> >>>> + >>>> +4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR >>>> + >>>> +Capability: KVM_CAP_DEVICE_CTRL >>>> +Type: device ioctl >>>> +Parameters: struct kvm_device_attr >>>> +Returns: 0 on success, -1 on error >>>> +Errors: >>>> + ENXIO: The group or attribute is unknown/unsupported for this device >>>> + EPERM: The attribute cannot (currently) be accessed this way >>>> + (e.g. read-only attribute, or attribute that only makes >>>> + sense when the device is in a different state) >>>> + >>>> + Other error conditions may be defined by individual device types. >>>> + >>>> +Gets/sets a specified piece of device configuration and/or state. The >>>> +semantics are device-specific. See individual device documentation in >>>> +the "devices" directory. As with ONE_REG, the size of the data >>>> +transferred is defined by the particular attribute. >>>> + >>>> +struct kvm_device_attr { >>>> + __u32 flags; /* no flags currently defined */ >>>> + __u32 group; /* device-defined */ >>>> + __u64 attr; /* group-defined */ >>>> + __u64 addr; /* userspace address of attr data */ >>>> +}; >>>> + >>>> +4.81 KVM_HAS_DEVICE_ATTR >>>> + >>>> +Capability: KVM_CAP_DEVICE_CTRL >>>> +Type: device ioctl >>>> +Parameters: struct kvm_device_attr >>>> +Returns: 0 on success, -1 on error >>>> +Errors: >>>> + ENXIO: The group or attribute is unknown/unsupported for this device >>>> + >>>> +Tests whether a device supports a particular attribute. A successful >>>> +return indicates the attribute is implemented. It does not necessarily >>>> +indicate that the attribute can be read or written in the device's >>>> +current state. "addr" is ignored. >>>> >>>> 4.77 KVM_ARM_VCPU_INIT >>>> >>>> diff --git a/Documentation/virtual/kvm/devices/README b/Documentation/virtual/kvm/devices/README >>>> new file mode 100644 >>>> index 0000000..34a6983 >>>> --- /dev/null >>>> +++ b/Documentation/virtual/kvm/devices/README >>>> @@ -0,0 +1 @@ >>>> +This directory contains specific device bindings for KVM_CAP_DEVICE_CTRL. >>>> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h >>>> index 20d77d2..8fce9bc 100644 >>>> --- a/include/linux/kvm_host.h >>>> +++ b/include/linux/kvm_host.h >>>> @@ -1063,6 +1063,41 @@ static inline bool kvm_check_request(int req, struct kvm_vcpu *vcpu) >>>> >>>> extern bool kvm_rebooting; >>>> >>>> +struct kvm_device_ops; >>>> + >>>> +struct kvm_device { >>>> + struct kvm_device_ops *ops; >>>> + struct kvm *kvm; >>>> + atomic_t users; >>>> + void *private; >>>> +}; >>>> + >>>> +/* create, destroy, and name are mandatory */ >>>> +struct kvm_device_ops { >>>> + const char *name; >>>> + int (*create)(struct kvm_device *dev, u32 type); >>>> + >>>> + /* >>>> + * Destroy is responsible for freeing dev. >>>> + * >>>> + * Destroy may be called before or after destructors are called >>>> + * on emulated I/O regions, depending on whether a reference is >>>> + * held by a vcpu or other kvm component that gets destroyed >>>> + * after the emulated I/O. >>>> + */ >>>> + void (*destroy)(struct kvm_device *dev); >>>> + >>>> + int (*set_attr)(struct kvm_device *dev, struct kvm_device_attr *attr); >>>> + int (*get_attr)(struct kvm_device *dev, struct kvm_device_attr *attr); >>>> + int (*has_attr)(struct kvm_device *dev, struct kvm_device_attr *attr); >>>> + long (*ioctl)(struct kvm_device *dev, unsigned int ioctl, >>>> + unsigned long arg); >>>> +}; >>>> + >>>> +void kvm_device_get(struct kvm_device *dev); >>>> +void kvm_device_put(struct kvm_device *dev); >>>> +struct kvm_device *kvm_device_from_filp(struct file *filp); >>>> + >>>> #ifdef CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT >>>> >>>> static inline void kvm_vcpu_set_in_spin_loop(struct kvm_vcpu *vcpu, bool val) >>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h >>>> index 74d0ff3..20ce2d2 100644 >>>> --- a/include/uapi/linux/kvm.h >>>> +++ b/include/uapi/linux/kvm.h >>>> @@ -668,6 +668,7 @@ struct kvm_ppc_smmu_info { >>>> #define KVM_CAP_PPC_EPR 86 >>>> #define KVM_CAP_ARM_PSCI 87 >>>> #define KVM_CAP_ARM_SET_DEVICE_ADDR 88 >>>> +#define KVM_CAP_DEVICE_CTRL 89 >>>> >>>> #ifdef KVM_CAP_IRQ_ROUTING >>>> >>>> @@ -909,6 +910,32 @@ struct kvm_s390_ucas_mapping { >>>> #define KVM_ARM_SET_DEVICE_ADDR _IOW(KVMIO, 0xab, struct kvm_arm_device_addr) >>>> >>>> /* >>>> + * Device control API, available with KVM_CAP_DEVICE_CTRL >>>> + */ >>>> +#define KVM_CREATE_DEVICE_TEST 1 >>>> + >>>> +struct kvm_create_device { >>>> + __u32 type; /* in: KVM_DEV_TYPE_xxx */ >>>> + __u32 fd; /* out: device handle */ >>>> + __u32 flags; /* in: KVM_CREATE_DEVICE_xxx */ >>>> +}; >>>> + >>>> +struct kvm_device_attr { >>>> + __u32 flags; /* no flags currently defined */ >>>> + __u32 group; /* device-defined */ >>>> + __u64 attr; /* group-defined */ >>>> + __u64 addr; /* userspace address of attr data */ >>>> +}; >>> Please move struct definitions and KVM_CREATE_DEVICE_TEST define out >>> from ioctl definition block. >> >> Let me change that in my tree... >> > So are you sending this via your tree and I should not apply it directly? I was hoping to have things ready very soon for you to just pull... Alex -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html