Hi Vivek, On 9/2/21 11:48 AM, Vivek Kumar Gautam wrote: > Hi Alex, > > > On 9/2/21 3:29 PM, Alexandru Elisei wrote: >> Hi Vivek, >> >> On 8/10/21 7:25 AM, Vivek Gautam wrote: >>> Add support to parse extended configuration space for vfio based >>> assigned PCIe devices and add extended capabilities for the device >>> in the guest. This allows the guest to see and work on extended >>> capabilities, for example to toggle PRI extended cap to enable and >>> disable Shared virtual addressing (SVA) support. >>> PCIe extended capability header that is the first DWORD of all >>> extended caps is shown below - >>> >>> 31 20 19 16 15 0 >>> ____________________|_______|_____________________ >>> | Next cap off | 1h | Cap ID | >>> |____________________|_______|_____________________| >>> >>> Out of the two upper bytes of extended cap header, the >>> lower nibble is actually cap version - 0x1. >>> 'next cap offset' if present at bits [31:20], should be >>> right shifted by 4 bits to calculate the position of next >>> capability. >>> This change supports parsing and adding ATS, PRI and PASID caps. >>> >>> Signed-off-by: Vivek Gautam <vivek.gautam@xxxxxxx> >>> --- >>> include/kvm/pci.h | 6 +++ >>> vfio/pci.c | 104 ++++++++++++++++++++++++++++++++++++++++++---- >>> 2 files changed, 103 insertions(+), 7 deletions(-) [..] >>> @@ -725,7 +815,7 @@ static int vfio_pci_parse_caps(struct vfio_device *vdev) >>> static int vfio_pci_parse_cfg_space(struct vfio_device *vdev) >>> { >>> - ssize_t sz = PCI_DEV_CFG_SIZE_LEGACY; >>> + ssize_t sz = PCI_DEV_CFG_SIZE; >>> struct vfio_region_info *info; >>> struct vfio_pci_device *pdev = &vdev->pci; >>> @@ -831,10 +921,10 @@ static int vfio_pci_fixup_cfg_space(struct vfio_device >>> *vdev) >>> /* Install our fake Configuration Space */ >>> info = &vdev->regions[VFIO_PCI_CONFIG_REGION_INDEX].info; >>> /* >>> - * We don't touch the extended configuration space, let's be cautious >>> - * and not overwrite it all with zeros, or bad things might happen. >>> + * Update the extended configuration space as well since we >>> + * are now populating the extended capabilities. >>> */ >>> - hdr_sz = PCI_DEV_CFG_SIZE_LEGACY; >>> + hdr_sz = PCI_DEV_CFG_SIZE; >> >> In one of the earlier versions of the PCI Express patches I was doing the same >> thing here - overwriting the entire PCI Express configuration space for a device. >> However, that made one of the devices I was using for testing stop working when >> assigned to a VM. >> >> I'll go through my testing notes and test it again, the cause of the failure might >> have been something else entirely which was fixed since then. > > Sure. Let me know your findings. I think I found the card that doesn't work when overwriting the extended device configuration space. I tried device assignment with a Realtek 8168 Gigabit Ethernet card on a Seattle machine, and the host freezes when I try to start a VM. Even after reset, the machine doesn't boot anymore and it gets stuck during the boot process at this message: NewPackageList status: EFI_SUCCESS BDS.SignalConnectDriversEvent(feeb6d60) BDS.ConnectRootBridgeHandles(feeb6db0) It doesn't go away no matter how many times I reset the machine, to get it booting again I have to pull the plug and plug it again. I tried assigning the device to a VM several times, and this happened every time. The card doesn't have the caps that you added, this is caused entirely by the config space write (tried it with only the config space change). It could be a problem kvmtool, with Linux or with the machine, but this is the only machine where device assignment works and I would like to keep it working with this NIC. One solution I see is to add a field to vfio_pci_device (something like has_pcie), and based on that, vfio_pci_fixup_cfg_space() could overwrite only the first 256 bytes or the entire device configuration space. It's also not clear to me what you are trying to achieve with this patch. Is there a particular device that you want to get working? Or an entire class of devices which have those features? If it's the former, you could have the size of the config space write depend on the vendor + device id. If it's the latter, we could key the size of the config space write based on the presence of those particular PCIE caps and try and fix other devices if they break. Will, Andre, do you see other solutions? Do you have a preference? Thanks, Alex > > Best regards > Vivek > >> >> Thanks, >> >> Alex >> >>> if (pwrite(vdev->fd, &pdev->hdr, hdr_sz, info->offset) != hdr_sz) { >>> vfio_dev_err(vdev, "failed to write %zd bytes to Config Space", >>> hdr_sz);