Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration

On Tue, Sep 20, 2016 at 02:05:52AM +0530, Kirti Wankhede wrote:
> 
> Hi libvirt experts,
> 
> Thanks for valuable input on v1 version of RFC.
> 
> Quick brief, VFIO based mediated device framework provides a way to
> virtualize their devices without SR-IOV, like NVIDIA vGPU, Intel KVMGT
> and IBM's channel IO. This framework reuses VFIO APIs for all the
> functionalities for mediated devices which are currently being used for
> pass through devices. This framework introduces a set of new sysfs files
> for device creation and its life cycle management.
> 
> Here is the summary of discussion on v1:
> 1. Discover mediated device:
> As part of physical device initialization process, vendor driver will
> register their physical devices, which will be used to create virtual
> device (mediated device, aka mdev) to the mediated framework.
> 
> Vendor driver should specify mdev_supported_types in directory format.
> This format is class based, for example, display class directory format
> should be as below. We need to define such set for each class of devices
> which would be supported by mediated device framework.
> 
>  --- mdev_destroy
>  --- mdev_supported_types
>      |-- 11
>      |   |-- create
>      |   |-- name
>      |   |-- fb_length
>      |   |-- resolution
>      |   |-- heads
>      |   |-- max_instances
>      |   |-- params
>      |   |-- requires_group
>      |-- 12
>      |   |-- create
>      |   |-- name
>      |   |-- fb_length
>      |   |-- resolution
>      |   |-- heads
>      |   |-- max_instances
>      |   |-- params
>      |   |-- requires_group
>      |-- 13
>          |-- create
>          |-- name
>          |-- fb_length
>          |-- resolution
>          |-- heads
>          |-- max_instances
>          |-- params
>          |-- requires_group
> 
> 
> In the above example directory '11' represents a type id of mdev device.
> 'name', 'fb_length', 'resolution', 'heads', 'max_instances' and
> 'requires_group' would be Read-Only files that vendor would provide to
> describe that type.
> 
> 'create':
>     Write-only file. Mandatory.
>     Accepts string to create mediated device.
> 
> 'name':
>     Read-Only file. Mandatory.
>     Returns string, the name of that type id.

Presumably this is a human-targeted title/description of
the device.

> 
> 'fb_length':
>     Read-only file. Mandatory.
>     Returns <number>{K,M,G}, size of framebuffer.
> 
> 'resolution':
>     Read-Only file. Mandatory.
>     Returns 'hres x vres' format. Maximum supported resolution.
> 
> 'heads':
>     Read-Only file. Mandatory.
>     Returns integer. Number of maximum heads supported.

None of these should be mandatory as that makes the mdev
useless for non-GPU devices.

I'd expect to see a 'class' or 'type' attribute in the
directory which tells you what kind of mdev it is. A
valid 'class' value would be 'gpu'. The fb_length,
resolution, and heads parameters would only be mandatory
when class==gpu.
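
To illustrate what class-dependent mandatory attributes could look like to a
consumer, here is a minimal sketch. It assumes the 'class' file suggested
above exists; the helper name missing_attrs and the two lookup tables are
hypothetical, and only the 'gpu' entry is taken from this discussion.

```python
from pathlib import Path

# Files assumed mandatory for every type directory. 'create' and 'name' come
# from the proposal; 'class' is the attribute suggested in this reply.
COMMON_ATTRS = {"create", "name", "class"}

# Extra files that are mandatory only for a given class (hypothetical table;
# only the 'gpu' row is taken from the thread).
CLASS_ATTRS = {"gpu": {"fb_length", "resolution", "heads"}}


def missing_attrs(type_dir):
    """Return the set of mandatory files absent from one type directory."""
    type_dir = Path(type_dir)
    present = {p.name for p in type_dir.iterdir() if p.is_file()}
    missing = COMMON_ATTRS - present
    if "class" in present:
        dev_class = (type_dir / "class").read_text().strip()
        # Unknown classes add no extra requirements in this sketch.
        missing |= CLASS_ATTRS.get(dev_class, set()) - present
    return missing
```

With this shape a non-GPU mdev only has to provide the common files, and
adding a new class means adding one row to the table.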

> 'max_instances':
>     Read-Only file. Mandatory.
>     Returns integer.  Returns the maximum number of mdev devices that could
> be created at the moment when this file is read. This count would be
> updated by vendor driver. Before creating mdev device of this type, check
> if max_instances is > 0.
> 
> 'params'
>     Write-Only file. Optional.
>     String input. Libvirt would pass the string given in XML file to
> this file and then create mdev device. Set empty string to clear params.
> For example, set parameter 'frame_rate_limiter=0' to disable frame rate
> limiter for performance benchmarking, then create device of type 11. The
> device created would have that parameter set by vendor driver.

Nope, libvirt will explicitly *NEVER* allow arbitrary opaque
passthrough of vendor specific data in this way.

> The parent device would look like:
> 
>    <device>
>      <name>pci_0000_86_00_0</name>
>      <capability type='pci'>
>        <domain>0</domain>
>        <bus>134</bus>
>        <slot>0</slot>
>        <function>0</function>
>        <capability type='mdev'>
>          <!-- one type element per sysfs directory -->
>          <type id='11'>
>            <!-- one element per sysfs file roughly -->
>            <name>GRID M60-0B</name>
>            <attribute name='fb_length'>512M</attribute>
>            <attribute name='resolution'>2560x1600</attribute>
>            <attribute name='heads'>2</attribute>
>            <attribute name='max_instances'>16</attribute>
>            <attribute name='requires_group'>1</attribute>
>          </type>

There would need to be a <class> element, e.g. <class>gpu</class>.

We would then have further elements based on the class, e.g.:

          <type id='11'>
            <!-- one element per sysfs file roughly -->
            <name>GRID M60-0B</name>
            <fb_length>512M</fb_length>
            <resolution>2560x1600</resolution>
            <heads>2</heads>
            <max_instances>16</max_instances>
            <requires_group>1</requires_group>
          </type>
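
As a rough sketch of how a management application might consume such a
<type> element: the sample XML below includes a <class> element, which is
what is being suggested here rather than anything that exists today. The
function names are hypothetical, and treating K/M/G as binary multiples is
an assumption, since the proposal does not say.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample following the element layout suggested above.
SAMPLE = """
<type id='11'>
  <class>gpu</class>
  <name>GRID M60-0B</name>
  <fb_length>512M</fb_length>
  <resolution>2560x1600</resolution>
  <heads>2</heads>
  <max_instances>16</max_instances>
</type>
"""

# Assumption: K/M/G are powers of 1024.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert a '<number>{K,M,G}' string into a byte count."""
    match = re.fullmatch(r"(\d+)([KMG])", text.strip())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return int(match.group(1)) * UNITS[match.group(2)]


def parse_type(xml_text):
    """Parse one <type> element into a dict, adding GPU fields for class 'gpu'."""
    elem = ET.fromstring(xml_text.strip())
    info = {
        "id": elem.get("id"),
        "class": elem.findtext("class"),
        "name": elem.findtext("name"),
    }
    if info["class"] == "gpu":
        hres, vres = elem.findtext("resolution").lower().split("x")
        info["fb_bytes"] = parse_size(elem.findtext("fb_length"))
        info["resolution"] = (int(hres), int(vres))
        info["heads"] = int(elem.findtext("heads"))
    return info
```

Because each attribute is its own element, the application can type-check
each one instead of interpreting a generic name/value pair.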



>        </capability>
>        <product id='...'>GRID M60</product>
>        <vendor id='0x10de'>NVIDIA</vendor>
>      </capability>
>    </device>
> 
> 2. Create/destroy mediated device
> 
> With above example, vGPU device XML would look like:
> 
>    <device>
>      <name>my-vgpu</name>
>      <parent>pci_0000_86_00_0</parent>
>      <capability type='mdev'>
>        <type id='11'/>
>        <group>1</group>
>        <params>'frame_rate_limiter=0'</params>

No, we will not support <params> in this manner in libvirt.

The entire purpose of libvirt is to represent data in a
vendor agnostic manner and not do arbitrary passthrough
of vendor specific data. Simply saying this field is
optional does not get around that either.

>      </capability>
>    </device>
> 
> 'type id' is mandatory.
> 'group' is optional. It should be a unique number in the system among
> all the groups created for mdev devices. Its usage is:
>   - not needed if single vGPU device is being assigned to a domain.
>   - only need to be set if multiple vGPUs need to be assigned to a
> domain and vendor driver have 'requires_group' file in type id directory.
>   - if type id directory include 'requires_group' and user tries to
> assign multiple vGPUs to a domain without having <group> field in XML,
> it will create single vGPU.
> 
> 'params' is an optional field. User should set this field if extra
> parameters need to be set for a particular vGPU device. Libvirt doesn't
> need to parse these params. These are meant for vendor driver.
> 
> Libvirt need to follow the sequence to create device:
> * Read /sys/../0000\:86\:00.0/11/max_instances. If it is greater than 0,
> then only proceed else fail.
> 
> * Set extra params if 'params' field exist in device XML and 'params'
> file exist in type id directory
> 
>     echo "frame_rate_limiter=0" > /sys/../0000\:86\:00.0/11/params

We cannot do that step.

> 
> * Autogenerate UUID
> * Create device:
> 
>     echo "$UUID:<group>" > /sys/../0000\:86\:00.0/11/create
> 
>     where <group> is optional. Group should be unique number among all
> the groups created for mdev devices.
> 
> * Clear params, if set earlier:
> 
>     echo "" > /sys/../0000\:86\:00.0/11/params
> 
> * To destroy device:
> 
>     echo $UUID > /sys/../0000\:86\:00.0/mdev_destroy
> 
> 
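
For reference, the sequence without the params step could be sketched as
below. This is only an illustration under stated assumptions: the parent
directory is passed in by the caller because the sysfs path is elided in
the commands above, the type directory layout follows those commands, and
writing the bare UUID when no group is given is a guess at the intended
format. The function names are hypothetical.

```python
import uuid
from pathlib import Path


def create_mdev(parent_dir, type_id, group=None):
    """Create one mediated device of the given type and return its UUID."""
    # Layout follows the quoted commands: <parent>/<type id>/{max_instances,create}
    type_dir = Path(parent_dir) / str(type_id)

    # Step 1: only proceed if the vendor driver reports free instances.
    if int((type_dir / "max_instances").read_text()) <= 0:
        raise RuntimeError(f"no instances of type {type_id} available")

    # Step 2: autogenerate the UUID.
    dev_uuid = str(uuid.uuid4())

    # Step 3: write "$UUID:<group>"; the group part is optional.
    # Assumption: without a group, the bare UUID is written.
    request = dev_uuid if group is None else f"{dev_uuid}:{group}"
    (type_dir / "create").write_text(request)
    return dev_uuid


def destroy_mdev(parent_dir, dev_uuid):
    """Destroy a mediated device by writing its UUID to mdev_destroy."""
    (Path(parent_dir) / "mdev_destroy").write_text(dev_uuid)
```

Note that nothing vendor specific is written at any point; the only inputs
are the type id and the optional group.
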
> 3. Start/stop mediated device
> 
> No change or requirement for libvirt as this will be handled by open()
> and close() callbacks to vendor driver. In case of multiple devices and
> 'requires_group' set, this will be handled in 'first open()' and 'last
> close()' on device in that group.
> 
> 4. Launch QEMU/VM
> 
>  Pass the mdev sysfs path to QEMU as vfio-pci device.
>  For above vGPU device example:
> 
>     -device vfio-pci,sysfsdev=/sys/bus/mdev/devices/$UUID
> 
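
A small sketch of how the command line fragment could be generated from the
UUID. The function name is hypothetical; the -device syntax is the one
quoted above. Validating the UUID first is a design choice in this sketch,
so that a malformed value cannot smuggle extra comma-separated options into
the device string.

```python
import uuid


def vfio_mdev_args(dev_uuid):
    """Return the QEMU arguments for one mdev, given its UUID string."""
    # uuid.UUID raises ValueError on anything that is not a valid UUID.
    canonical = str(uuid.UUID(dev_uuid))
    return ["-device", f"vfio-pci,sysfsdev=/sys/bus/mdev/devices/{canonical}"]
```
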
> 5. QEMU/VM Shutdown sequence
> 
> No change or requirement for libvirt.
> 
> 6. VM Reset
> 
> No change or requirement for libvirt as this will be handled via VFIO
> reset API and QEMU process will keep running as before.
> 
> 7. Hot-plug
> 
> It is same syntax to create a virtual device for hot-plug.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
