On 01/26/2016 05:30 AM, Alex Williamson wrote:
> [cc +Neo @Nvidia]
>
> Hi Jike,
>
> On Mon, 2016-01-25 at 19:34 +0800, Jike Song wrote:
>> On 01/20/2016 05:05 PM, Tian, Kevin wrote:
>>> I would expect we can spell out next level tasks toward above
>>> direction, upon which Alex can easily judge whether there are
>>> some common VFIO framework changes that he can help :-)
>>
>> Hi Alex,
>>
>> Here is a draft task list after a short discussion w/ Kevin,
>> would you please have a look?
>>
>> Bus Driver
>>
>> { in i915/vgt/xxx.c }
>>
>> - define a subset of vfio_pci interfaces
>> - selective pass-through (say aperture)
>> - trap MMIO: interface w/ QEMU
>
> What's included in the subset?  Certainly the bus reset ioctls really
> don't apply, but you'll need to support the full device interface,
> right?  That includes the region info ioctl and access through the vfio
> device file descriptor as well as the interrupt info and setup ioctls.
>

[I thought all the interfaces were via ioctls :) For other stuff like
the file descriptor, we'll definitely keep it.]

The list of ioctl commands provided by vfio_pci:

- VFIO_DEVICE_GET_PCI_HOT_RESET_INFO
- VFIO_DEVICE_PCI_HOT_RESET

As you said, the above 2 don't apply. But for this:

- VFIO_DEVICE_RESET

In my opinion it should be kept, no matter what will be provided in the
bus driver.

- VFIO_PCI_ROM_REGION_INDEX
- VFIO_PCI_VGA_REGION_INDEX

I suppose the above 2 don't apply either? For a vgpu we don't provide a
ROM BAR or VGA region.

- VFIO_DEVICE_GET_INFO
- VFIO_DEVICE_GET_REGION_INFO
- VFIO_DEVICE_GET_IRQ_INFO
- VFIO_DEVICE_SET_IRQS

The above 4 are needed, of course. We will need to extend:

- VFIO_DEVICE_GET_REGION_INFO

  a) adding a flag: DONT_MAP. For example, the MMIO of a vgpu should be
     trapped instead of being mmap-ed.

  b) adding other information. For example, for the OpRegion, QEMU needs
     to do more than mmap a region; it has to:

     - allocate a region
     - copy contents from somewhere in the host to that region
     - mmap it to the guest

I remember you already have a prototype for this?

>> IOMMU
>>
>> { in a new vfio_xxx.c }
>>
>> - allocate: struct device & IOMMU group
>
> It seems like the vgpu instance management would do this.
>

Yes, it can be removed from here.

>> - map/unmap functions for vgpu
>> - rb-tree to maintain iova/hpa mappings
>
> Yep, pretty much what type1 does now, but without mapping through the
> IOMMU API.  Essentially just a database of the current userspace
> mappings that can be accessed for page pinning and IOVA->HPA
> translation.
>

Yes.

>> - interacts with kvmgt.c
>>
>>
>> vgpu instance management
>>
>> { in i915 }
>>
>> - path, create/destroy
>>
>
> Yes, and since you're creating and destroying the vgpu here, this is
> where I'd expect a struct device to be created and added to an IOMMU
> group.  The lifecycle management should really include links between
> the vGPU and physical GPU, which would be much, much easier to do with
> struct devices created here rather than at the point where we start
> doing vfio "stuff".
>

Yes, just like SR-IOV does.

> Nvidia has also been looking at this and has some ideas how we might
> standardize on some of the interfaces and create a vgpu framework to
> help share code between vendors and hopefully make a more consistent
> userspace interface for libvirt as well.  I'll let Neo provide some
> details.  Thanks,

Good to know, so we can possibly cooperate on some common parts, e.g.
the instance management :)
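To make that concrete, below is a rough sketch of the create path I have
in mind, following your suggestion of a per-instance struct device
parented to the physical GPU and placed into its own IOMMU group. This is
only an illustration -- the names (vgpu_device, vgpu_create) are made up,
nothing like this exists in i915 today:

/*
 * Illustration only: vgpu_device/vgpu_create are made-up names.
 * Each vgpu instance gets its own struct device, parented to the
 * physical GPU, plus its own IOMMU group so that it can later be
 * handed to a vfio bus driver.
 */
#include <linux/device.h>
#include <linux/err.h>
#include <linux/iommu.h>
#include <linux/pci.h>
#include <linux/slab.h>

struct vgpu_device {
	struct device		dev;	/* per-instance struct device */
	struct iommu_group	*group;	/* its private IOMMU group */
	/* vendor-specific state (vgpu type, MMIO layout, ...) */
};

static void vgpu_dev_release(struct device *dev)
{
	kfree(container_of(dev, struct vgpu_device, dev));
}

/* Create one vgpu instance under the physical GPU @pdev. */
static struct vgpu_device *vgpu_create(struct pci_dev *pdev, int id)
{
	struct vgpu_device *vgpu;
	int ret;

	vgpu = kzalloc(sizeof(*vgpu), GFP_KERNEL);
	if (!vgpu)
		return ERR_PTR(-ENOMEM);

	device_initialize(&vgpu->dev);
	vgpu->dev.parent = &pdev->dev;		/* link vgpu -> physical GPU */
	vgpu->dev.release = vgpu_dev_release;
	dev_set_name(&vgpu->dev, "vgpu%d", id);

	ret = device_add(&vgpu->dev);
	if (ret)
		goto err_put;

	/* no real IOMMU behind this device, so allocate a group by hand */
	vgpu->group = iommu_group_alloc();
	if (IS_ERR(vgpu->group)) {
		ret = PTR_ERR(vgpu->group);
		goto err_del;
	}

	ret = iommu_group_add_device(vgpu->group, &vgpu->dev);
	iommu_group_put(vgpu->group);	/* the device now holds a reference */
	if (ret)
		goto err_del;

	return vgpu;

err_del:
	device_del(&vgpu->dev);
err_put:
	put_device(&vgpu->dev);
	return ERR_PTR(ret);
}

Destroy would just be the reverse: iommu_group_remove_device() followed
by device_unregister().

For the iommu part, the iova->hpa database you described could be little
more than an rb-tree of pinned ranges, e.g. (again just a sketch,
vgpu_iommu/vgpu_mapping are made-up names):

/*
 * Illustration only: a database of current userspace mappings, keyed by
 * iova, which the vendor driver consults for pinning and iova->hpa
 * translation.  Callers are assumed to hold iommu->lock.
 */
#include <linux/mutex.h>
#include <linux/rbtree.h>
#include <linux/slab.h>

struct vgpu_mapping {
	struct rb_node	node;
	unsigned long	iova;	/* guest/IO virtual address */
	unsigned long	hpa;	/* host physical address of pinned pages */
	size_t		size;
};

struct vgpu_iommu {
	struct rb_root	mappings;	/* vgpu_mapping nodes, keyed by iova */
	struct mutex	lock;
};

static struct vgpu_mapping *vgpu_find_mapping(struct vgpu_iommu *iommu,
					      unsigned long iova)
{
	struct rb_node *n = iommu->mappings.rb_node;

	while (n) {
		struct vgpu_mapping *m = rb_entry(n, struct vgpu_mapping, node);

		if (iova < m->iova)
			n = n->rb_left;
		else if (iova >= m->iova + m->size)
			n = n->rb_right;
		else
			return m;
	}
	return NULL;
}

static void vgpu_insert_mapping(struct vgpu_iommu *iommu,
				struct vgpu_mapping *new)
{
	struct rb_node **link = &iommu->mappings.rb_node, *parent = NULL;

	while (*link) {
		struct vgpu_mapping *m;

		parent = *link;
		m = rb_entry(parent, struct vgpu_mapping, node);
		if (new->iova < m->iova)
			link = &(*link)->rb_left;
		else
			link = &(*link)->rb_right;
	}
	rb_link_node(&new->node, parent, link);
	rb_insert_color(&new->node, &iommu->mappings);
}

The map path would pin the user pages and insert such a node, unmap
erases it and unpins, and the vendor driver calls the lookup for
iova->hpa translation before programming the device's DMA.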
>
> Alex

--
Thanks,
Jike