> From: Yishai Hadas <yishaih@xxxxxxxxxx> > Sent: Wednesday, December 13, 2023 8:25 PM > > On 13/12/2023 10:23, Tian, Kevin wrote: > >> From: Yishai Hadas <yishaih@xxxxxxxxxx> > >> Sent: Thursday, December 7, 2023 6:28 PM > >> > >> + > >> +static ssize_t virtiovf_pci_read_config(struct vfio_device *core_vdev, > >> + char __user *buf, size_t count, > >> + loff_t *ppos) > >> +{ > >> + struct virtiovf_pci_core_device *virtvdev = container_of( > >> + core_vdev, struct virtiovf_pci_core_device, core_device.vdev); > >> + loff_t pos = *ppos & VFIO_PCI_OFFSET_MASK; > >> + size_t register_offset; > >> + loff_t copy_offset; > >> + size_t copy_count; > >> + __le32 val32; > >> + __le16 val16; > >> + u8 val8; > >> + int ret; > >> + > >> + ret = vfio_pci_core_read(core_vdev, buf, count, ppos); > >> + if (ret < 0) > >> + return ret; > >> + > >> + if (range_intersect_range(pos, count, PCI_DEVICE_ID, sizeof(val16), > >> + ©_offset, ©_count, > >> ®ister_offset)) { > >> + val16 = cpu_to_le16(VIRTIO_TRANS_ID_NET); > >> + if (copy_to_user(buf + copy_offset, (void *)&val16 + > >> register_offset, copy_count)) > >> + return -EFAULT; > >> + } > >> + > >> + if ((le16_to_cpu(virtvdev->pci_cmd) & PCI_COMMAND_IO) && > >> + range_intersect_range(pos, count, PCI_COMMAND, sizeof(val16), > >> + ©_offset, ©_count, > >> ®ister_offset)) { > >> + if (copy_from_user((void *)&val16 + register_offset, buf + > >> copy_offset, > >> + copy_count)) > >> + return -EFAULT; > >> + val16 |= cpu_to_le16(PCI_COMMAND_IO); > >> + if (copy_to_user(buf + copy_offset, (void *)&val16 + > >> register_offset, > >> + copy_count)) > >> + return -EFAULT; > >> + } > > > > the write handler calls vfio_pci_core_write() for PCI_COMMAND so > > the core vconfig should have the latest copy of the IO bit value which > > is copied to the user buffer by vfio_pci_core_read(). then not necessary > > to update it again. > > You assume the the 'vconfig' mechanism/flow is always applicable for > that specific field, this should be double-checked. > However, as for now the driver doesn't rely / use the vconfig for other > fields as it doesn't match and need a big refactor, I prefer to not rely > on it at all and have it here. iiuc this driver does relies on vconfig for other fields. It first calls vfio_pci_core_read() and then modifies selected fields which needs special tweak in this driver. btw what virtio-net specific tweak is applied to PCI_COMMAND? You basically cache the cmd value and then set PCI_COMMAND_IO if it's set in the cached value. But this is already covered by pci vconfig. If anything is broken there then we already have a big trouble. The trick for bar0 makes sense as it doesn't exist physically then the vconfig mechanism may not give the expected result. But I didn't see the same rationale for PCI_COMMAND. > >> + > >> +static ssize_t > >> +virtiovf_pci_core_write(struct vfio_device *core_vdev, const char __user > >> *buf, > >> + size_t count, loff_t *ppos) > >> +{ > >> + struct virtiovf_pci_core_device *virtvdev = container_of( > >> + core_vdev, struct virtiovf_pci_core_device, core_device.vdev); > >> + unsigned int index = VFIO_PCI_OFFSET_TO_INDEX(*ppos); > >> + loff_t pos = *ppos & VFIO_PCI_OFFSET_MASK; > >> + > >> + if (!count) > >> + return 0; > >> + > >> + if (index == VFIO_PCI_CONFIG_REGION_INDEX) { > >> + size_t register_offset; > >> + loff_t copy_offset; > >> + size_t copy_count; > >> + > >> + if (range_intersect_range(pos, count, PCI_COMMAND, > >> sizeof(virtvdev->pci_cmd), > >> + ©_offset, ©_count, > >> + ®ister_offset)) { > >> + if (copy_from_user((void *)&virtvdev->pci_cmd + > >> register_offset, > >> + buf + copy_offset, > >> + copy_count)) > >> + return -EFAULT; > >> + } > >> + > >> + if (range_intersect_range(pos, count, PCI_BASE_ADDRESS_0, > >> + sizeof(virtvdev->pci_base_addr_0), > >> + ©_offset, ©_count, > >> + ®ister_offset)) { > >> + if (copy_from_user((void *)&virtvdev- > >>> pci_base_addr_0 + register_offset, > >> + buf + copy_offset, > >> + copy_count)) > >> + return -EFAULT; > >> + } > >> + } > > > > wrap above into virtiovf_pci_write_config() to be symmetric with > > the read path. > > Upon the read path, we do the full flow and return to the user. Here we > just save some data and continue to call the core, so I'm not sure that > this worth a dedicated function. I don't see a real difference. for the read path you first call vfio_pci_core_read() then modifies some fields. for the write path you save some data then call vfio_pci_core_write(). > > However, this can be done, do you still suggest it for V8 ? yes > >> +static int virtiovf_pci_init_device(struct vfio_device *core_vdev) > >> +{ > >> + struct virtiovf_pci_core_device *virtvdev = container_of( > >> + core_vdev, struct virtiovf_pci_core_device, core_device.vdev); > >> + struct pci_dev *pdev; > >> + int ret; > >> + > >> + ret = vfio_pci_core_init_dev(core_vdev); > >> + if (ret) > >> + return ret; > >> + > >> + pdev = virtvdev->core_device.pdev; > >> + ret = virtiovf_read_notify_info(virtvdev); > >> + if (ret) > >> + return ret; > >> + > >> + /* Being ready with a buffer that supports MSIX */ > >> + virtvdev->bar0_virtual_buf_size = VIRTIO_PCI_CONFIG_OFF(true) + > >> + virtiovf_get_device_config_size(pdev- > >>> device); > > > > which code is relevant to MSIX? > > The buffer size must include the MSIX part to match the virtio-net > specification. > > As part of virtiovf_issue_legacy_rw_cmd() we may use the full buffer > upon read/write. at least mention that MSIX is in the device config region otherwise it's not helpful to people w/o virtio background. > >> + > >> +static const struct vfio_device_ops virtiovf_vfio_pci_ops = { > >> + .name = "virtio-vfio-pci", > >> + .init = vfio_pci_core_init_dev, > >> + .release = vfio_pci_core_release_dev, > >> + .open_device = virtiovf_pci_open_device, > > > > could be vfio_pci_core_open_device(). Given virtiovf specific init func > > is not called virtiovf_pci_open_device() is essentially same as the > > core func. > > We don't have today vfio_pci_core_open_device() as an exported function. > > The virtiovf_pci_open_device() matches both cases so I don't see a real > reason to export it now. > > By the way, it follows other drivers in vfio, see for example here [1]. > > [1] > https://elixir.bootlin.com/linux/v6.7- > rc5/source/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c#L1383 ah, yes. > >> + > >> +static int virtiovf_pci_probe(struct pci_dev *pdev, > >> + const struct pci_device_id *id) > >> +{ > >> + const struct vfio_device_ops *ops = &virtiovf_vfio_pci_ops; > >> + struct virtiovf_pci_core_device *virtvdev; > >> + int ret; > >> + > >> + if (pdev->is_virtfn && virtio_pci_admin_has_legacy_io(pdev) && > >> + !virtiovf_bar0_exists(pdev)) > >> + ops = &virtiovf_vfio_pci_tran_ops; > > > > I have a confusion here. > > > > why do we want to allow this driver binding to non-matching VF or > > even PF? > > The intention is to allow the binding of any virtio-net device (i.e. PF, > VF which is not transitional capable) to have a single driver over VFIO > for all virtio-net devices. > > This enables any user space application to bind and use any virtio-net > device without the need to care. > > In case the device is not transitional capable, it will simply use the > generic vfio functionality. Is it useful to print a message here? > > > > > if that is the intention then the naming/description should be adjusted > > to not specific to vf throughout this patch. > > > > e.g. don't use "virtiovf_" prefix... > > The main functionality is to supply the transitional device to user > space for the VF, hence the prefix and the description for that driver > refers to VF. > > Let's stay with that. ok > > > > > the config option is generic: > > > > +config VIRTIO_VFIO_PCI > > + tristate "VFIO support for VIRTIO NET PCI devices" > > > > but the description is specific to vf: > > > > + This provides support for exposing VIRTIO NET VF devices which > support > > + legacy IO access, using the VFIO framework that can work with a > legacy > > + virtio driver in the guest. > > > > then the module description is generic again: > > > > +MODULE_DESCRIPTION( > > + "VIRTIO VFIO PCI - User Level meta-driver for VIRTIO NET devices"); > > > > Yes, as the binding allows that, it looks fine to me. > what about revising the kconfig message to make it clear that it's for all virtio-net device with special trick to make VF as a transitional device?