Re: [libvirt PATCH v2 12/15] conf: support manually specifying VFIO variant driver in <hostdev> XML

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Dec 18, 2023 at 01:25:02 -0500, Laine Stump wrote:
> On 11/28/23 9:58 AM, Peter Krempa wrote:
> > On Mon, Nov 06, 2023 at 02:38:57 -0500, Laine Stump wrote:
> > > This patch makes it possible to manually specify which VFIO variant
> > > driver to use for PCI hostdev device assignment, so that, e.g. you
> > > could force use of the generic vfio-pci driver with
> > > 
> > >    <driver name='vfio-pci'/>
> > > 
> > > when libvirt would have normally (after applying a subsequent patch)
> > > found a "better match" for a device in the active kernel's
> > > modules.alias file. (The main use of this manual override would be to
> > > work around a bug in a new VFIO variant driver by temporarily not
> > > using that driver).
> > > 
> > > This sounds like a simple addition to XML parsing/formatting, but of
> > > course it's messier than that, since the attribute we want to use was
> > > previously used in config for a now non-existent purpose (choosing a
> > > type of device assignment that was removed from the kernel nearly a
> > > decade ago), and continues to be used *internal to the C code*.
> > > 
> > > Background:
> > > 
> > > Beginning when VFIO device assignment support was added to <hostdev>
> > > or <interface type='hostdev'/>) in libvirt's QEMU driver, it was
> > > possible to specify which device assignment backend to use with
> > > "<driver name='vfio|kvm'/>". This was only useful for a couple of
> > > years in the early 2010's when VFIO device assignment was newly
> > > introduced, but "legacy KVM" device assignment was still possible (in
> > > reality "<driver name='blah'/>" was only intended to be used (1) in
> > > the *very* early days of VFIO when "kvm" was still the default, or (2)
> > > when the host kernel was too old to have VFIO support (meaning it was
> > > e.g. pre-RHEL7) or (3) some bug was encountered with the then-new VFIO
> > > code that could have been avoided by switching back to the older style
> > > of device assignment (I don't recall anyone ever needing to set it for
> > > this reason, but that is one of the main reasons we added the knob).
> > > 
> > > Later, when the libxl (Xen) driver in libvirt began supporting "device
> > > passthrough" with <hostdev>, they added <driver name='xen'/> to
> > > indicate that mode of operation. But in practice this was also never
> > > necessary in any config, since Xen only supports one type of device
> > > assignment, and so the attribute was anyway set in the C object by the
> > > libxl driver code prior to calling any hypervisor-common code
> > > (i.e. the stuff in hypervisor/hostdev.c and util/virpci.c) that needs
> > > to know which type of device assignment is being used - setting it in
> > > the XML config was superfluous (kind of like me saying "I am now
> > > describing this patch in a human language", ref: Perd Hapley on "Parks
> > > and Recreation").
> > > 
> > > So the setting was available in the XML, but never needed to be set by
> > > the user. Because it was automatically set in the C code though,
> > > sometimes libvirt-generated XML would contain the option even though
> > > the user hadn't included it in the original input XML (e.g. the libxl
> > > driver sets it in the PostParse, so all saved configs with a PCI
> > > <hostdev> device will have <driver name='xen'/>; also status XML from
> > > the QEMU and libxl drivers will contain <driver name='vfio|xen'/> - in
> > > both cases completely unnecessary and redundant information).
> > > 
> > > When adding support for specifying a variant driver for VFIO device
> > > assignment, from the beginning I have wanted to use <driver
> > > name='blah'/> to force a specific variant driver (e.g. <driver
> > > name='mlx5_vfio_pci'/>), with the assumption that the name attribute
> > > could be easily repurposed because it had been *completely* unused for
> > > so long. What I discovered though, was that 1) the field in the C
> > > object corresponding to driver name was an enum value rather than a
> > > string (so parser and formatter need to be changed, not just the
> > > driver code looking at a string in the C object), and 2) even though
> > > the XML attribute was effectively unused (except in some output), the
> > > field in the C object was *always* being set internally as a way for
> > > the hypervisor driver code to inform the hypervisor-common hostdev
> > > manager (in src/hypervisor) which method of device assignment to
> > > use. So re-use wasn't as simple as I'd wished.
> > > 
> > > However.
> > > 
> > > What I have hit upon that permits us to use
> > > <driver name='mlx5_vfio_pci'/> is to do the following:
> > > 
> > > 1) rename the existing driver name attribute to driver *type* - this
> > >     will now be the enum that is set internally by the hypervisor
> > >     driver prior to calling the hostdev manager (and also is what will
> > >     show up in status XML; the libxl driver was modified in a previous
> > >     patch so that the setting isn't done until domain runtime (rather
> > >     than during PostParse), so it will no longer show up in
> > >     regurgitated libxml config)
> > > 
> > > 2) add a new attribute with the now-newly-unused name "name" - this
> > >     will be a string that can be used to communicate a specific host
> > >     kernel driver to the hostdev manager.
> > > 
> > > 3) In the parsing code for <driver>, if <driver name='vfio|xen|kvm'/>
> > >     is encountered, set the driver *type* to the appropriate enum
> > >     instead, and clear our the name string. (since the four places
> > >     that use the hostdev <driver> element now all call a common parser
> > >     function, this check is only needed in a single place)
> > > 
> > > I could alternately just leave "name" as-is, and create a new
> > > attribute with a never-before-used name within driver. I suppose
> > > instead of "type" and "name", we could instead have "name and "model"
> > > (where "model" would be the string that could be set to
> > > "mlx5_vfio_pci"). I just prefer type/name to name/model, and think
> > > it's been long enough since <driver name='blah'/> has had a useful
> > > purpose (and the fixup for backward compatibility is limited to a few
> > > lines in one function) that this change is safe.
> > 
> > I'm not entirely sold on the reusing of the 'name' attribute and this
> > paragraph seems to confirm that it's just for optics.
> > 
> > I'm a bit worried that some tool may be checking the output here however
> > useless it is and that tool might then need changing just because we
> > wanted "better looking" attribute names.
> 
> Sigh. Yeah, you're probably right.
> 
> Okay, so how about keeping <driver name='vfio|xen'> (but still minimizing
> its auto-generation as much as possible), and using
> <driver model='blah'/> when we want to specify a particular driver?

That sounds good. The 'name=' attribute should be present in the runtime
XML when it was selected, but based on your previous explanation we
don't need to keep putting it into the config at post-parse time.

If it was specified though, don't remove it.

> > > (another alternate would be to add an extra parameter to the C API
> > > that calls into the hostdev manager rather than keeping/adding the
> > > publicly visible "type" attribute, but it seems at least possible that
> > > sometime in the future another alternate "type" of device assignment
> > > could be introduced for either Xen or QEMU, and we would then need to
> > > add back the XML attribute anyway)
> > > 
> > > I'd be fine with doing it one of the other ways, but decided to post
> > > this way first, since it's the one that I find the most aesthetically
> > > pleasing :-)
> > 
> > Okay this paragraph is a bit too much of discussion for a commit
> > message. Please put stuff like this under the lines or as a RFC
> > question.
> 
> 
> Yeah, I meant to do that but forgot.
> 
> 
> > 
> > 
> > > Note that a few test cases have been modified - places where an
> > > existing "name='vfio'" was changed to "type='vfio'" were in in status
> > > XML or only the *output* XML for a test (except the case of the
> > > virnetworkportxml2xmltest, which doesn't have a separate directory for
> > > the XML result; fortunately the converged parsing of <driver> between
> > > domain/network/networkport means that the test cases for network and
> > > domain XML are already testing the same code that would convert "name"
> > > to "type" in thte networkport XML)
> > > 
> > > Signed-off-by: Laine Stump <laine@xxxxxxxxxx>
> > > ---
> > 
> > The code looks okay but I'm not a fan of the rename for cosmetical
> > reasons. Especially if we fiddle with the value in the field with
> > original name and display it back to the user/management sw.
> 
> I'll redo everything to use <driver model> and try to get it posted in the
> next day or two. If you or anyone else thinks of a better name let me know.
> 
_______________________________________________
Devel mailing list -- devel@xxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxx




[Index of Archives]     [Virt Tools]     [Libvirt Users]     [Lib OS Info]     [Fedora Users]     [Fedora Desktop]     [Fedora SELinux]     [Big List of Linux Books]     [Yosemite News]     [KDE Users]     [Fedora Tools]

  Powered by Linux