On Mon, Dec 18, 2023 at 01:25:02 -0500, Laine Stump wrote: > On 11/28/23 9:58 AM, Peter Krempa wrote: > > On Mon, Nov 06, 2023 at 02:38:57 -0500, Laine Stump wrote: > > > This patch makes it possible to manually specify which VFIO variant > > > driver to use for PCI hostdev device assignment, so that, e.g. you > > > could force use of the generic vfio-pci driver with > > > > > > <driver name='vfio-pci'/> > > > > > > when libvirt would have normally (after applying a subsequent patch) > > > found a "better match" for a device in the active kernel's > > > modules.alias file. (The main use of this manual override would be to > > > work around a bug in a new VFIO variant driver by temporarily not > > > using that driver). > > > > > > This sounds like a simple addition to XML parsing/formatting, but of > > > course it's messier than that, since the attribute we want to use was > > > previously used in config for a now non-existent purpose (choosing a > > > type of device assignment that was removed from the kernel nearly a > > > decade ago), and continues to be used *internal to the C code*. > > > > > > Background: > > > > > > Beginning when VFIO device assignment support was added to <hostdev> > > > or <interface type='hostdev'/>) in libvirt's QEMU driver, it was > > > possible to specify which device assignment backend to use with > > > "<driver name='vfio|kvm'/>". This was only useful for a couple of > > > years in the early 2010's when VFIO device assignment was newly > > > introduced, but "legacy KVM" device assignment was still possible (in > > > reality "<driver name='blah'/>" was only intended to be used (1) in > > > the *very* early days of VFIO when "kvm" was still the default, or (2) > > > when the host kernel was too old to have VFIO support (meaning it was > > > e.g. pre-RHEL7) or (3) some bug was encountered with the then-new VFIO > > > code that could have been avoided by switching back to the older style > > > of device assignment (I don't recall anyone ever needing to set it for > > > this reason, but that is one of the main reasons we added the knob). > > > > > > Later, when the libxl (Xen) driver in libvirt began supporting "device > > > passthrough" with <hostdev>, they added <driver name='xen'/> to > > > indicate that mode of operation. But in practice this was also never > > > necessary in any config, since Xen only supports one type of device > > > assignment, and so the attribute was anyway set in the C object by the > > > libxl driver code prior to calling any hypervisor-common code > > > (i.e. the stuff in hypervisor/hostdev.c and util/virpci.c) that needs > > > to know which type of device assignment is being used - setting it in > > > the XML config was superfluous (kind of like me saying "I am now > > > describing this patch in a human language", ref: Perd Hapley on "Parks > > > and Recreation"). > > > > > > So the setting was available in the XML, but never needed to be set by > > > the user. Because it was automatically set in the C code though, > > > sometimes libvirt-generated XML would contain the option even though > > > the user hadn't included it in the original input XML (e.g. the libxl > > > driver sets it in the PostParse, so all saved configs with a PCI > > > <hostdev> device will have <driver name='xen'/>; also status XML from > > > the QEMU and libxl drivers will contain <driver name='vfio|xen'/> - in > > > both cases completely unnecessary and redundant information). > > > > > > When adding support for specifying a variant driver for VFIO device > > > assignment, from the beginning I have wanted to use <driver > > > name='blah'/> to force a specific variant driver (e.g. <driver > > > name='mlx5_vfio_pci'/>), with the assumption that the name attribute > > > could be easily repurposed because it had been *completely* unused for > > > so long. What I discovered though, was that 1) the field in the C > > > object corresponding to driver name was an enum value rather than a > > > string (so parser and formatter need to be changed, not just the > > > driver code looking at a string in the C object), and 2) even though > > > the XML attribute was effectively unused (except in some output), the > > > field in the C object was *always* being set internally as a way for > > > the hypervisor driver code to inform the hypervisor-common hostdev > > > manager (in src/hypervisor) which method of device assignment to > > > use. So re-use wasn't as simple as I'd wished. > > > > > > However. > > > > > > What I have hit upon that permits us to use > > > <driver name='mlx5_vfio_pci'/> is to do the following: > > > > > > 1) rename the existing driver name attribute to driver *type* - this > > > will now be the enum that is set internally by the hypervisor > > > driver prior to calling the hostdev manager (and also is what will > > > show up in status XML; the libxl driver was modified in a previous > > > patch so that the setting isn't done until domain runtime (rather > > > than during PostParse), so it will no longer show up in > > > regurgitated libxml config) > > > > > > 2) add a new attribute with the now-newly-unused name "name" - this > > > will be a string that can be used to communicate a specific host > > > kernel driver to the hostdev manager. > > > > > > 3) In the parsing code for <driver>, if <driver name='vfio|xen|kvm'/> > > > is encountered, set the driver *type* to the appropriate enum > > > instead, and clear our the name string. (since the four places > > > that use the hostdev <driver> element now all call a common parser > > > function, this check is only needed in a single place) > > > > > > I could alternately just leave "name" as-is, and create a new > > > attribute with a never-before-used name within driver. I suppose > > > instead of "type" and "name", we could instead have "name and "model" > > > (where "model" would be the string that could be set to > > > "mlx5_vfio_pci"). I just prefer type/name to name/model, and think > > > it's been long enough since <driver name='blah'/> has had a useful > > > purpose (and the fixup for backward compatibility is limited to a few > > > lines in one function) that this change is safe. > > > > I'm not entirely sold on the reusing of the 'name' attribute and this > > paragraph seems to confirm that it's just for optics. > > > > I'm a bit worried that some tool may be checking the output here however > > useless it is and that tool might then need changing just because we > > wanted "better looking" attribute names. > > Sigh. Yeah, you're probably right. > > Okay, so how about keeping <driver name='vfio|xen'> (but still minimizing > its auto-generation as much as possible), and using > <driver model='blah'/> when we want to specify a particular driver? That sounds good. The 'name=' attribute should be present in the runtime XML when it was selected, but based on your previous explanation we don't need to keep putting it into the config at post-parse time. If it was specified though, don't remove it. > > > (another alternate would be to add an extra parameter to the C API > > > that calls into the hostdev manager rather than keeping/adding the > > > publicly visible "type" attribute, but it seems at least possible that > > > sometime in the future another alternate "type" of device assignment > > > could be introduced for either Xen or QEMU, and we would then need to > > > add back the XML attribute anyway) > > > > > > I'd be fine with doing it one of the other ways, but decided to post > > > this way first, since it's the one that I find the most aesthetically > > > pleasing :-) > > > > Okay this paragraph is a bit too much of discussion for a commit > > message. Please put stuff like this under the lines or as a RFC > > question. > > > Yeah, I meant to do that but forgot. > > > > > > > > > Note that a few test cases have been modified - places where an > > > existing "name='vfio'" was changed to "type='vfio'" were in in status > > > XML or only the *output* XML for a test (except the case of the > > > virnetworkportxml2xmltest, which doesn't have a separate directory for > > > the XML result; fortunately the converged parsing of <driver> between > > > domain/network/networkport means that the test cases for network and > > > domain XML are already testing the same code that would convert "name" > > > to "type" in thte networkport XML) > > > > > > Signed-off-by: Laine Stump <laine@xxxxxxxxxx> > > > --- > > > > The code looks okay but I'm not a fan of the rename for cosmetical > > reasons. Especially if we fiddle with the value in the field with > > original name and display it back to the user/management sw. > > I'll redo everything to use <driver model> and try to get it posted in the > next day or two. If you or anyone else thinks of a better name let me know. > _______________________________________________ Devel mailing list -- devel@xxxxxxxxxxxxxxxxx To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxx