Re: [PATCH V1 vfio 9/9] vfio/virtio: Introduce a vfio driver over virtio devices

Jason Gunthorpe <jgg@xxxxxxxxxx> · Wed, 18 Oct 2023 13:33:33 -0300

On Tue, Oct 17, 2023 at 02:24:48PM -0600, Alex Williamson wrote:
> On Tue, 17 Oct 2023 16:42:17 +0300
> Yishai Hadas <yishaih@xxxxxxxxxx> wrote:
> > +static int virtiovf_pci_probe(struct pci_dev *pdev,
> > +			      const struct pci_device_id *id)
> > +{
> > +	const struct vfio_device_ops *ops = &virtiovf_acc_vfio_pci_ops;
> > +	struct virtiovf_pci_core_device *virtvdev;
> > +	int ret;
> > +
> > +	if (pdev->is_virtfn && virtiovf_support_legacy_access(pdev) &&
> > +	    !virtiovf_bar0_exists(pdev) && pdev->msix_cap)
> > +		ops = &virtiovf_acc_vfio_pci_tran_ops;
> 
> This is still an issue for me, it's a very narrow use case where we
> have a modern device and want to enable legacy support.  Implementing an
> IO BAR and mangling the device ID seems like it should be an opt-in,
> not standard behavior for any compatible device.  Users should
> generally expect that the device they see in the host is the device
> they see in the guest.  They might even rely on that principle.

I think this should be configured when the VF is provisioned. If the
user does not want legacy IO BAR support then the VF should not
advertise the capability, and the driver will not offer it.

I think that is a very reasonable way to approach this - it is how we
approached similar problems for mlx5. The provisioning interface is
what "profiles" the VF, regardless of whether VFIO is driving it or not.
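
To make that concrete, here is a minimal userspace sketch of the
selection, assuming the provisioning step is the only thing that sets
the legacy capability. The struct, enum and function names are
hypothetical stand-ins, not the patch's code; only the shape of the
condition mirrors the probe check quoted above.

```c
#include <stdbool.h>

/* Hypothetical model of what provisioning leaves visible on a VF. */
struct vf_profile {
	bool is_virtfn;      /* function is a VF */
	bool legacy_access;  /* advertised only if the VF was provisioned for it */
	bool has_bar0;       /* device already exposes a BAR0 */
	bool has_msix;       /* MSI-X capability present */
};

enum vf_ops { VF_OPS_NATIVE, VF_OPS_TRANSITIONAL };

/*
 * The driver makes no policy decision of its own: it exposes the
 * transitional personality only when the provisioned function
 * advertises legacy access. Everything else stays native.
 */
static enum vf_ops select_ops(const struct vf_profile *p)
{
	if (p->is_virtfn && p->legacy_access && !p->has_bar0 && p->has_msix)
		return VF_OPS_TRANSITIONAL;
	return VF_OPS_NATIVE;
}
```

In this model, opting out of the transitional device means provisioning
the VF without legacy_access; nothing on the VFIO side has to change.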

> We can't use the argument that users wanting the default device should
> use vfio-pci rather than virtio-vfio-pci because we've already defined
> the algorithm by which libvirt should choose a variant driver for a
> device.  libvirt will choose this driver for all virtio-net devices.

Well, we can if the use case is niche. I think profiling a virtio VF
to support legacy IO BAR emulation and then not wanting to use it is
a niche case.

The same argument is going to come with live migration. This same driver
will still bind and enable live migration if the virtio function is
profiled to support it. If you don't want that in your system then
don't profile the VF for migration support.

> This driver effectively has the option to expose two different profiles
> for the device, native or transitional.  We've discussed profile
> support for variant drivers previously as an equivalent functionality
> to mdev types, but the only use case for this currently is out-of-tree.
> I think this might be the opportunity to define how device profiles are
> exposed and selected in a variant driver.

Honestly, I've been trying to keep this out of VFIO...

The function is profiled when it is created, by whatever created
it. As in the other thread we have a vast amount of variation in what
is required to provision the function in the first place. "Legacy IO
BAR emulation support" is just one thing. virtio-net needs to be
hooked up to real network and get a MAC, virtio-blk needs to be hooked
up to real storage and get media. At a minimum. This is big and
complicated.

It may not even be the x86 running VFIO that is doing this
provisioning, the PCI function may come pre-provisioned from a DPU.

It feels better to keep that all in one place, in whatever external
thing is preparing the function before giving it to VFIO. VFIO is
concerned with operating a prepared function.

When we get to SIOV it should not be VFIO that is
provisioning/creating functions. The owning driver should be doing
this and routing the function to VFIO (e.g. with an aux device or
otherwise).

This gets back to the qemu thread on the Grace patch where we need to
ask how does the libvirt world see this, given there is no good way to
generically handle all scenarios without a userspace driver to operate
elements.

> Jason had previously suggested a devlink interface for this, but I
> understand that path had been shot down by devlink developers.  

I think we got some things supported, but supporting all things was
shot down.

> Another obvious option is sysfs, where we might imagine an optional
> "profiles" directory, perhaps under vfio-dev.  Attributes of
> "available" and "current" could allow discovery and selection of a
> profile similar to mdev types.

IMHO it is a far too complex problem for sysfs.

Jason