On 10/11/2016 05:47 PM, Paolo Bonzini wrote: > > > On 11/10/2016 11:21, Xiao Guangrong wrote: >> >> >> On 10/11/2016 04:54 PM, Paolo Bonzini wrote: >>> >>> >>> On 11/10/2016 04:39, Xiao Guangrong wrote: >>>> >>>> >>>> On 10/11/2016 02:32 AM, Paolo Bonzini wrote: >>>>> >>>>> >>>>> On 10/10/2016 20:01, Neo Jia wrote: >>>>>>> Hi Neo, >>>>>>> >>>>>>> AFAIK this is needed because KVMGT doesn't paravirtualize the PPGTT, >>>>>>> while nVidia does. >>>>>> >>>>>> Hi Paolo and Xiaoguang, >>>>>> >>>>>> I am just wondering how device driver can register a notifier so he >>>>>> can be >>>>>> notified for write-protected pages when writes are happening. >>>>> >>>>> It can't yet, but the API is ready for that. kvm_vfio_set_group is >>>>> currently where a struct kvm_device* and struct vfio_group* touch. >>>>> Given >>>>> a struct kvm_device*, dev->kvm provides the struct kvm to be passed to >>>>> kvm_page_track_register_notifier. So I guess you could add a callback >>>>> that passes the struct kvm_device* to the mdev device. >>>>> >>>>> Xiaoguang and Guangrong, what were your plans? We discussed it briefly >>>>> at KVM Forum but I don't remember the details. >>>> >>>> Your suggestion was that pass kvm fd to KVMGT via VFIO, so that we can >>>> figure out the kvm instance based on the fd. >>>> >>>> We got a new idea, how about search the kvm instance by mm_struct, it >>>> can work as KVMGT is running in the vcpu context and it is much more >>>> straightforward. >>> >>> Perhaps I didn't understand your suggestion, but the same mm_struct can >>> have more than 1 struct kvm so I'm not sure that it can work. >> >> vcpu->pid is valid during vcpu running so that it can be used to figure >> out which kvm instance owns the vcpu whose pid is the one as current >> thread, i think it can work. :) > > No, don't do that. There's no reason for a thread to run a single VCPU, > and if you can have multiple VCPUs you can also have multiple VCPUs from > multiple VMs. > > Passing file descriptors around are the right way to connect subsystems. [CC Alex, Kevin and Qemu-devel] Hi Paolo & Alex, IIUC, passing file descriptors means touching QEMU and the UAPI between QEMU and VFIO. Would you guys have a look at below draft patch? If it's on the correct direction, I'll send the split ones. Thanks! -- Thanks, Jike diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c index bec694c..f715d37 100644 --- a/hw/vfio/pci-quirks.c +++ b/hw/vfio/pci-quirks.c @@ -10,12 +10,14 @@ * the COPYING file in the top-level directory. */ +#include <sys/ioctl.h> #include "qemu/osdep.h" #include "qemu/error-report.h" #include "qemu/range.h" #include "qapi/error.h" #include "hw/nvram/fw_cfg.h" #include "pci.h" +#include "sysemu/kvm.h" #include "trace.h" /* Use uin32_t for vendor & device so PCI_ANY_ID expands and cannot match hw */ @@ -1844,3 +1846,15 @@ void vfio_setup_resetfn_quirk(VFIOPCIDevice *vdev) break; } } + +void vfio_quirk_kvmgt(VFIOPCIDevice *vdev) +{ + int vmfd; + + if (!kvm_enabled() || !vdev->kvmgt) + return; + + /* Tell the device what KVM it attached */ + vmfd = kvm_get_vmfd(kvm_state); + ioctl(vdev->vbasedev.fd, VFIO_SET_KVMFD, vmfd); +} diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index a5a620a..8732552 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -2561,6 +2561,8 @@ static int vfio_initfn(PCIDevice *pdev) return ret; } + vfio_quirk_kvmgt(vdev); + /* Get a copy of config space */ ret = pread(vdev->vbasedev.fd, vdev->pdev.config, MIN(pci_config_size(&vdev->pdev), vdev->config_size), @@ -2832,6 +2834,7 @@ static Property vfio_pci_dev_properties[] = { DEFINE_PROP_UINT32("x-pci-sub-device-id", VFIOPCIDevice, sub_device_id, PCI_ANY_ID), DEFINE_PROP_UINT32("x-igd-gms", VFIOPCIDevice, igd_gms, 0), + DEFINE_PROP_BOOL("kvmgt", VFIOPCIDevice, kvmgt, false), /* * TODO - support passed fds... is this necessary? * DEFINE_PROP_STRING("vfiofd", VFIOPCIDevice, vfiofd_name), diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h index 7d482d9..813832c 100644 --- a/hw/vfio/pci.h +++ b/hw/vfio/pci.h @@ -143,6 +143,7 @@ typedef struct VFIOPCIDevice { bool no_kvm_intx; bool no_kvm_msi; bool no_kvm_msix; + bool kvmgt; } VFIOPCIDevice; uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len); @@ -166,4 +167,6 @@ int vfio_populate_vga(VFIOPCIDevice *vdev); int vfio_pci_igd_opregion_init(VFIOPCIDevice *vdev, struct vfio_region_info *info); +void vfio_quirk_kvmgt(VFIOPCIDevice *vdev); + #endif /* HW_VFIO_VFIO_PCI_H */ diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h index df67cc0..dd8320a 100644 --- a/include/sysemu/kvm.h +++ b/include/sysemu/kvm.h @@ -254,6 +254,7 @@ void phys_mem_set_alloc(void *(*alloc)(size_t, uint64_t *align)); int kvm_ioctl(KVMState *s, int type, ...); int kvm_vm_ioctl(KVMState *s, int type, ...); +int kvm_get_vmfd(KVMState *s); int kvm_vcpu_ioctl(CPUState *cpu, int type, ...); diff --git a/kvm-all.c b/kvm-all.c index efb5fe3..bd72ce3 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -2065,6 +2065,11 @@ int kvm_vm_ioctl(KVMState *s, int type, ...) return ret; } +int kvm_get_vmfd(KVMState *s) +{ + return s->vmfd; +} + int kvm_vcpu_ioctl(CPUState *cpu, int type, ...) { int ret; diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h index 759b850..952303f 100644 --- a/linux-headers/linux/vfio.h +++ b/linux-headers/linux/vfio.h @@ -686,6 +686,12 @@ struct vfio_iommu_spapr_tce_remove { }; #define VFIO_IOMMU_SPAPR_TCE_REMOVE _IO(VFIO_TYPE, VFIO_BASE + 20) + +/** + * VFIO_SET_KVMFD - _IO(VFIO_TYPE, VFIO_BASE + 21, __u32) + */ +#define VFIO_SET_KVMFD _IO(VFIO_TYPE, VFIO_BASE + 21) + /* ***************************************************************** */ #endif /* VFIO_H */ -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html