Re: [RFC v2] libvirt vGPU QEMU integration

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





>>>>> My concern is that a type id seems arbitrary but we're specifying that
>>>>> it be unique.  We already have something unique, the name.  So why try
>>>>> to make the type id unique as well?  A vendor can accidentally create
>>>>> their vendor driver so that a given name means something very
>>>>> specific.  On the other hand they need to be extremely deliberate to
>>>>> coordinate that a type id means a unique thing across all their product
>>>>> lines.
>>>>>     
>>>>
>>>> Let me clarify, type id should be unique in the list of
>>>> mdev_supported_types. You can't have 2 directories in with same name.  
>>>
>>> Of course, but does that mean it's only unique to the machine I'm
>>> currently running on?  Let's say I have a Tesla P100 on my system and
>>> type-id 11 is named "GRID-M60-0B".  At some point in the future I
>>> replace the Tesla P100 with a Q1000 (made up).  Is type-id 11 on that
>>> new card still going to be a "GRID-M60-0B"?  If not then we've based
>>> our XML on the wrong attribute.  If the new device does not support
>>> "GRID-M60-0B" then we should generate an error, not simply initialize
>>> whatever type-id 11 happens to be on this new card.
>>>   
>>
>> If there are 2 M60 in the system then you would find '11' type directory
>> in mdev_supported_types of both M60. If you have P100, '11' type would
>> not be there in its mdev_supported_types, it will have different types.
>>
>> For example, if you replace M60 with P100, but XML is not updated. XML
>> have type '11'. When libvirt would try to create mdev device, libvirt
>> would have to find 'create' file in sysfs in following directory format:
>>
>>  --- mdev_supported_types
>>      |-- 11
>>      |   |-- create
>>
>> but now for P100, '11' directory is not there, so libvirt should throw
>> error on not able to find '11' directory.
> 
> This really seems like an accident waiting to happen.  What happens
> when the user replaces their M60 with an Intel XYZ device that happens
> to expose a type 11 mdev class gpu device?  How is libvirt supposed to
> know that the XML used to refer to a GRID-M60-0B and now it's an
> INTEL-IGD-XYZ?  Doesn't basing the XML entry on the name and removing
> yet another arbitrary requirement that we have some sort of globally
> unique type-id database make a lot of sense?  The same issue applies
> for simple debug-ability, if I'm reviewing the XML for a domain and the
> name is the primary index for the mdev device, I know what it is.
> Seeing type-id='11' is meaningless.
>

Let me clarify again, type '11' is a string that vendor driver would
define (see my previous reply below) it could be "11" or "GRID-M60-0B".
If 2 vendors used same string we can't control that. right?


>>>> Lets remove 'id' from type id in XML if that is the concern. Supported
>>>> types is going to be defined by vendor driver, so let vendor driver
>>>> decide what to use for directory name and same should be used in device
>>>> xml file, it could be '11' or "GRID M60-0B":
>>>>
>>>>     <device>
>>>>       <name>my-vgpu</name>
>>>>       <parent>pci_0000_86_00_0</parent>
>>>>       <capability type='mdev'>
>>>>         <type='11'/>
>>>>         ...
>>>>       </capability>
>>>>     </device>
>>>>
>>>>  

Kirti

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list



[Index of Archives]     [Virt Tools]     [Libvirt Users]     [Lib OS Info]     [Fedora Users]     [Fedora Desktop]     [Fedora SELinux]     [Big List of Linux Books]     [Yosemite News]     [KDE Users]     [Fedora Tools]