On 2/2/2016 1:12 PM, Tian, Kevin wrote:
From: Kirti Wankhede [mailto:kwankhede@xxxxxxxxxx]
Sent: Tuesday, February 02, 2016 9:48 AM
Resending this mail; somehow my previous mail didn't reach everyone's inbox.
On 2/2/2016 3:16 AM, Kirti Wankhede wrote:
Design for vGPU Driver:
The main purpose of the vGPU driver is to provide a common interface for vGPU
management that can be used by different GPU drivers.
Thanks for composing this design, which is a good start.
This module would provide a generic interface to create the device, add it
to the vGPU bus, add the device to an IOMMU group and then add it to a VFIO
group.
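Roughly, that create path could look like the following. This is only a
sketch: device_register() and the iommu_group_*() calls are the existing
driver-core/IOMMU APIs, while vgpu_device_setup() and struct vgpu_device are
illustrative names, not final code.

/*
 * Sketch of the create path described above.
 */
static int vgpu_device_setup(struct vgpu_device *vgpu)
{
	struct iommu_group *group;
	int ret;

	/* Add the new device to the vGPU bus */
	ret = device_register(&vgpu->dev);
	if (ret)
		return ret;

	/* Each vGPU gets its own IOMMU group */
	group = iommu_group_alloc();
	if (IS_ERR(group)) {
		ret = PTR_ERR(group);
		goto err_unregister;
	}

	ret = iommu_group_add_device(group, &vgpu->dev);
	iommu_group_put(group);
	if (ret)
		goto err_unregister;

	/* VFIO group binding follows when vfio_vgpu.ko probes the device */
	return 0;

err_unregister:
	device_unregister(&vgpu->dev);
	return ret;
}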
High Level block diagram:
+--------------+    vgpu_register_driver()+---------------+
|     __init() +------------------------->+               |
|              |                          |               |
|              +<-------------------------+    vgpu.ko    |
| vfio_vgpu.ko |   probe()/remove()       |               |
|              |                +---------+               +---------+
+--------------+                |         +-------+-------+         |
                                |                 ^                 |
                                | callback        |                 |
                                |         +-------+--------+        |
                                |         |vgpu_register_device()   |
                                |         |                |        |
                                +---^-----+-----+    +-----+------+-+
                                    | nvidia.ko |    |  i915.ko   |
                                    |           |    |            |
                                    +-----------+    +------------+
vGPU driver provides two types of registration interfaces:
It looks like you missed the callbacks which vgpu.ko provides to
vfio_vgpu.ko, e.g. to retrieve basic region/resource info, etc...
Basic region info or resource info would come from the GPU driver, so
retrieving such info should be part of the GPU driver interface. As I
mentioned, we need to enhance these interfaces during development as and
when we find it useful.
vfio_vgpu.ko gets a dev pointer from which it can reach the vgpu_device
structure, and then it can use the GPU driver interface directly to retrieve
such information from the GPU driver.
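Concretely, that lookup could be as simple as the following. A minimal
sketch, assuming struct vgpu_device embeds its struct device and caches the
GPU driver's ops at registration time; the field names are illustrative,
not from the RFC.

struct vgpu_device {
	struct device		dev;
	struct gpu_device_ops	*ops;	/* set when the GPU registers */
	/* ... */
};

static inline struct vgpu_device *to_vgpu_device(struct device *dev)
{
	return container_of(dev, struct vgpu_device, dev);
}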
This RFC focuses more on the different modules and their structures, on how
those modules would be inter-linked with each other, and on keeping the
design flexible to leave scope for enhancements.
We have identified three modules:
* vgpu.ko - vGPU core driver that provides the registration interfaces for
GPU drivers and the vGPU VFIO driver, and is responsible for creating vGPU
devices and providing a management interface for them.
* vfio_vgpu.ko - vGPU VFIO driver for vGPU devices, provides the VFIO
interface that is used by QEMU.
* vfio_iommu_type1_vgpu.ko - IOMMU TYPE1 driver supporting the IOMMU
TYPE1 v1 and v2 interfaces.
The above block diagram gives an overview of how vgpu.ko, vfio_vgpu.ko and
the GPU drivers would be inter-linked with each other.
Also, for the GPU driver interfaces, it would be better to identify the
caller. E.g. it's easy to understand that life-cycle management would come
from sysfs, driven by a management stack like libvirt. What about @read and
@write? What's the connection between this vgpu core driver and a specific
hypervisor? etc. Better to connect all the necessary dots so we can refine
all the necessary requirements on this proposal.
The read and write calls are for PCI CFG space and MMIO space read/write.
A read/write access request from QEMU is passed to the GPU driver through
the GPU driver interface.
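For illustration, the read path in vfio_vgpu.ko would then be a thin wrapper
around this callback. In this sketch, vgpu_get_address_space() and
VGPU_OFFSET_MASK are hypothetical helpers for decoding the target address
space and offset from the VFIO file offset:

static ssize_t vfio_vgpu_read(void *device_data, char __user *ubuf,
			      size_t count, loff_t *ppos)
{
	struct vgpu_device *vdev = device_data;
	enum vgpu_address_space space = vgpu_get_address_space(*ppos);
	uint64_t offset = *ppos & VGPU_OFFSET_MASK;
	char buf[8];
	ssize_t ret;

	if (count > sizeof(buf))
		count = sizeof(buf);

	/* Forward the access to the GPU driver's emulation */
	ret = vdev->ops->read(vdev, buf, count, space, offset);
	if (ret > 0 && copy_to_user(ubuf, buf, ret))
		ret = -EFAULT;

	return ret;
}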
[...]
2. GPU driver interface
/**
 * struct gpu_device_ops - Structure to be registered for each physical GPU
 *                         to register the device to the vgpu module.
 *
 * @owner:                 The module owner.
 * @vgpu_supported_config: Called to get information about supported
 *                         vgpu types.
 *                         @dev: pci device structure of physical GPU.
 *                         @config: should return string listing supported
 *                         config.
 *                         Returns integer: success (0) or error (< 0)
 * @vgpu_create:           Called to allocate basic resources in graphics
 *                         driver for a particular vgpu.
 *                         @dev: physical pci device structure on which vgpu
 *                         should be created.
 *                         @uuid: uuid of the VM for which it is intended.
 *                         @instance: vgpu instance in that VM.
 *                         @vgpu_id: the type of vgpu to be created.
 *                         Returns integer: success (0) or error (< 0)
Specifically for Intel GVT-g we don't hard-partition resources among vGPUs.
Instead we allow the user to accurately control how many physical resources
are allocated to a vGPU. So this interface should be extensible to allow
vendor-specific resource control.
This interface forwards the create request to the vendor/GPU driver,
informing it which physical GPU the request is intended for and the type of
vGPU. Then it is the vendor/GPU driver's responsibility to allocate and
manage those resources in its own driver.
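For illustration, a vendor-side @vgpu_create might look like this. A pure
sketch: my_gpu, my_vgpu_type_lookup() and the framebuffer accounting are
made-up vendor details, exactly the kind of policy the core leaves to the
GPU driver.

static int my_vgpu_create(struct pci_dev *dev, uuid_le uuid,
			  uint32_t instance, uint32_t vgpu_id)
{
	struct my_gpu *gpu = pci_get_drvdata(dev);
	struct my_vgpu_type *type = my_vgpu_type_lookup(gpu, vgpu_id);

	if (!type)
		return -EINVAL;

	/* Vendor-specific admission policy, hard partitioning or otherwise */
	if (gpu->fb_avail < type->fb_size)
		return -ENOSPC;

	return my_vgpu_alloc(gpu, type, uuid, instance);
}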
And for the UUID, I remember Alex had a concern about using it in the
kernel. Honestly speaking, I don't have a good idea here. On the Xen side
there is a VM ID which can easily be used as the index. But for KVM, what
would be the best identifier to associate with a VM?
 * @vgpu_destroy:          Called to free resources in graphics driver for
 *                         a vgpu instance of that VM.
 *                         @dev: physical pci device structure to which
 *                         this vgpu points.
 *                         @uuid: uuid of the VM the vgpu belongs to.
 *                         @instance: vgpu instance in that VM.
 *                         Returns integer: success (0) or error (< 0)
 *                         If the VM is running and vgpu_destroy is called,
 *                         the vGPU is being hot-unplugged. Return an error
 *                         if the VM is running and the graphics driver
 *                         doesn't support vgpu hotplug.
 * @vgpu_start:            Called to initiate the vGPU initialization
 *                         process in graphics driver when the VM boots,
 *                         before qemu starts.
 *                         @uuid: UUID of the VM which is booting.
 *                         Returns integer: success (0) or error (< 0)
 * @vgpu_shutdown:         Called to tear down vGPU related resources for
 *                         the VM.
 *                         @uuid: UUID of the VM which is shutting down.
 *                         Returns integer: success (0) or error (< 0)
 * @read:                  Read emulation callback.
 *                         @vdev: vgpu device structure
 *                         @buf: read buffer
 *                         @count: number of bytes to read
 *                         @address_space: specifies for which address space
And I suppose there'll be an 'offset', as required by the usual emulation.
Yes, sorry, I missed that in this comment.
 *                         the request is: pci_config_space, IO register
 *                         space or MMIO space.
 *                         Returns number of bytes read on success, or error.
 * @write:                 Write emulation callback.
 *                         @vdev: vgpu device structure
 *                         @buf: write buffer
 *                         @count: number of bytes to be written
 *                         @address_space: specifies for which address space
 *                         the request is: pci_config_space, IO register
 *                         space or MMIO space.
 *                         Returns number of bytes written on success, or
 *                         error.
 * @vgpu_set_irqs:         Called to pass on the interrupt configuration
 *                         information that qemu sets.
 *                         @vdev: vgpu device structure
 *                         @flags, index, start, count and *data: same as
 *                         in struct vfio_irq_set of the
 *                         VFIO_DEVICE_SET_IRQS API.
Any elaboration on how this will be used in your case?
QEMU manages interrupts through the VFIO_DEVICE_SET_IRQS ioctl, i.e.
configuring, signaling, masking, and unmasking of interrupts. The
description of the VFIO_DEVICE_SET_IRQS ioctl in include/uapi/linux/vfio.h
explains the meaning of each combination of parameters to the VFIO driver.
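For reference, this is the usual shape of that ioctl from userspace, a
sketch against the existing VFIO uAPI (include/uapi/linux/vfio.h), here
arming one MSI vector of the device with an eventfd:

#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

static int set_msi_trigger(int device_fd, int event_fd)
{
	char buf[sizeof(struct vfio_irq_set) + sizeof(int32_t)];
	struct vfio_irq_set *irq_set = (struct vfio_irq_set *)buf;
	int32_t fd = event_fd;

	irq_set->argsz = sizeof(buf);
	irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD |
			 VFIO_IRQ_SET_ACTION_TRIGGER;
	irq_set->index = VFIO_PCI_MSI_IRQ_INDEX;
	irq_set->start = 0;
	irq_set->count = 1;
	memcpy(irq_set->data, &fd, sizeof(fd));	/* eventfd in the payload */

	return ioctl(device_fd, VFIO_DEVICE_SET_IRQS, irq_set);
}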
In the case of vGPU, interrupts are received by the owner of the physical
device, i.e. the GPU driver. The GPU driver then knows for which vGPU
device the interrupt is intended, so interrupt-related information should
be available to it. This interface forwards the interrupt-related
information from vfio_vgpu.ko to the GPU driver, and the GPU driver should
take the necessary action accordingly.
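So on the kernel side, the vfio_vgpu.ko handler would essentially just
unpack the ioctl and forward the same parameters; a sketch, with the
handler name and surrounding plumbing purely illustrative:

static long vfio_vgpu_set_irqs(struct vgpu_device *vdev,
			       struct vfio_irq_set *hdr, void *data)
{
	/* Hand QEMU's IRQ configuration straight to the GPU driver */
	return vdev->ops->vgpu_set_irqs(vdev, hdr->flags, hdr->index,
					hdr->start, hdr->count, data);
}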
 *
 * A physical GPU that supports vGPU should be registered with the vgpu
 * module with a gpu_device_ops structure.
 */
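Putting the callbacks documented above together, the registered structure
would look roughly like this. A sketch only: enum vgpu_address_space, the
uuid_le type and the exact parameter ordering (including the 'offset'
discussed above) are assumptions pending the full patch.

#include <linux/pci.h>
#include <linux/uuid.h>

struct vgpu_device;

enum vgpu_address_space {
	VGPU_CFG_SPACE,		/* PCI configuration space */
	VGPU_IO_SPACE,		/* IO register space */
	VGPU_MMIO_SPACE,	/* MMIO space */
};

struct gpu_device_ops {
	struct module *owner;

	int	(*vgpu_supported_config)(struct pci_dev *dev, char *config);
	int	(*vgpu_create)(struct pci_dev *dev, uuid_le uuid,
			       uint32_t instance, uint32_t vgpu_id);
	int	(*vgpu_destroy)(struct pci_dev *dev, uuid_le uuid,
				uint32_t instance);
	int	(*vgpu_start)(uuid_le uuid);
	int	(*vgpu_shutdown)(uuid_le uuid);

	ssize_t	(*read)(struct vgpu_device *vdev, char *buf, size_t count,
			enum vgpu_address_space address_space,
			uint64_t offset);
	ssize_t	(*write)(struct vgpu_device *vdev, char *buf, size_t count,
			 enum vgpu_address_space address_space,
			 uint64_t offset);

	int	(*vgpu_set_irqs)(struct vgpu_device *vdev, uint32_t flags,
				 unsigned int index, unsigned int start,
				 unsigned int count, void *data);
};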
Also it'd be a good design to allow extensible usages, such as statistics
and other vendor-specific control knobs (e.g. foreground/background VM
switch in Intel GVT-g, etc.).
Can you elaborate on what other control knobs would be needed?
Thanks,
Kirti.
Thanks
Kevin