On Thu, 22 Aug 2019 12:11:55 +0100 Steven Price <steven.price@xxxxxxx> wrote: > On 22/08/2019 11:57, Jonathan Cameron wrote: > > On Wed, 21 Aug 2019 16:36:53 +0100 > > Steven Price <steven.price@xxxxxxx> wrote: > > > >> Allow user space to inform the KVM host where in the physical memory > >> map the paravirtualized time structures should be located. > >> > >> A device is created which provides the base address of an array of > >> Stolen Time (ST) structures, one for each VCPU. There must be (64 * > >> total number of VCPUs) bytes of memory available at this location. > >> > >> The address is given in terms of the physical address visible to > >> the guest and must be page aligned. The guest will discover the address > >> via a hypercall. > >> > >> Signed-off-by: Steven Price <steven.price@xxxxxxx> > > > > Hi Steven, > > > > One general question inline. I'm not particularly familiar with this area > > of the kernel, so maybe I'm missing something obvious, but having > > .destroy free the kvm_device which wasn't created in .create seems > > 'unusual'. > > > > Otherwise, FWIW looks good to me. > > > > Jonathan > > > [...] > >> +static void kvm_arm_pvtime_destroy(struct kvm_device *dev) > >> +{ > >> + struct kvm_arch_pvtime *pvtime = &dev->kvm->arch.pvtime; > >> + > >> + pvtime->st_base = GPA_INVALID; > >> + kfree(dev); > > > > Nothing to do with your patch as such... All users do the same. > > > > This seems miss balanced. Why do we need to free the device by hand > > when we didn't create it in the create function? I appreciate > > the comments say this is needed, but as far as I can see every > > single callback does kfree(dev) at the end which seems an > > odd thing to do. > > Yes I think this is odd too - indeed when I initially wrote this I > missed off the kfree() call and had to track down the memory leak. > > When I looked into potentially tiding this up I found some other > oddities, e.g. "kvm-xive" (arch/powerpc/kvm/book3s_xive.c) doesn't have > a destroy callback. But I can't see anything in the common code which > deals with that case. So I decided to just "go with the flow" at the > moment, since I don't understand how some of these existing devices work > (perhaps they are already broken?). It has a release however and kvm_device_release also removes the device from the list that would then be cleared by kvm_destroy_devices. kvm_device_release is a release callback for the file operations so it 'might' be called in all paths. Fun though, in kvm_ioctl_create_device the error handling for the anon_inode_getfd calls ops->destroy without checking it exists. Boom. Possibly never happens in reality but looks like a bug to me. Jonathan > > Steve > > >> +} > >> + > >> +static int kvm_arm_pvtime_set_attr(struct kvm_device *dev, > >> + struct kvm_device_attr *attr) > >> +{ > >> + struct kvm *kvm = dev->kvm; > >> + struct kvm_arch_pvtime *pvtime = &kvm->arch.pvtime; > >> + u64 __user *user = (u64 __user *)attr->addr; > >> + struct kvm_dev_arm_st_region region; > >> + > >> + switch (attr->group) { > >> + case KVM_DEV_ARM_PV_TIME_REGION: > >> + if (copy_from_user(®ion, user, sizeof(region))) > >> + return -EFAULT; > >> + if (region.gpa & ~PAGE_MASK) > >> + return -EINVAL; > >> + if (region.size & ~PAGE_MASK) > >> + return -EINVAL; > >> + switch (attr->attr) { > >> + case KVM_DEV_ARM_PV_TIME_ST: > >> + if (pvtime->st_base != GPA_INVALID) > >> + return -EEXIST; > >> + pvtime->st_base = region.gpa; > >> + pvtime->st_size = region.size; > >> + return 0; > >> + } > >> + break; > >> + } > >> + return -ENXIO; > >> +} > >> + > >> +static int kvm_arm_pvtime_get_attr(struct kvm_device *dev, > >> + struct kvm_device_attr *attr) > >> +{ > >> + struct kvm_arch_pvtime *pvtime = &dev->kvm->arch.pvtime; > >> + u64 __user *user = (u64 __user *)attr->addr; > >> + struct kvm_dev_arm_st_region region; > >> + > >> + switch (attr->group) { > >> + case KVM_DEV_ARM_PV_TIME_REGION: > >> + switch (attr->attr) { > >> + case KVM_DEV_ARM_PV_TIME_ST: > >> + region.gpa = pvtime->st_base; > >> + region.size = pvtime->st_size; > >> + if (copy_to_user(user, ®ion, sizeof(region))) > >> + return -EFAULT; > >> + return 0; > >> + } > >> + break; > >> + } > >> + return -ENXIO; > >> +} > >> + > >> +static int kvm_arm_pvtime_has_attr(struct kvm_device *dev, > >> + struct kvm_device_attr *attr) > >> +{ > >> + switch (attr->group) { > >> + case KVM_DEV_ARM_PV_TIME_REGION: > >> + switch (attr->attr) { > >> + case KVM_DEV_ARM_PV_TIME_ST: > >> + return 0; > >> + } > >> + break; > >> + } > >> + return -ENXIO; > >> +} > >> + > >> +static const struct kvm_device_ops pvtime_ops = { > >> + "Arm PV time", > >> + .create = kvm_arm_pvtime_create, > >> + .destroy = kvm_arm_pvtime_destroy, > >> + .set_attr = kvm_arm_pvtime_set_attr, > >> + .get_attr = kvm_arm_pvtime_get_attr, > >> + .has_attr = kvm_arm_pvtime_has_attr > >> +}; > >> + > >> +void kvm_pvtime_init(void) > >> +{ > >> + kvm_register_device_ops(&pvtime_ops, KVM_DEV_TYPE_ARM_PV_TIME); > >> +} > > > > > > > > _______________________________________________ > > linux-arm-kernel mailing list > > linux-arm-kernel@xxxxxxxxxxxxxxxxxxx > > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel > > >