Hi Vivek, On 9/8/21 10:36 AM, Vivek Kumar Gautam wrote: > > > On 9/8/21 3:02 PM, Vivek Kumar Gautam wrote: >> Hi Alex, >> >> >> On 9/3/21 8:45 PM, Alexandru Elisei wrote: >>> Hi Vivek, >>> >>> On 9/2/21 11:48 AM, Vivek Kumar Gautam wrote: >>>> Hi Alex, >>>> >>>> >>>> On 9/2/21 3:29 PM, Alexandru Elisei wrote: >>>>> Hi Vivek, >>>>> >>>>> On 8/10/21 7:25 AM, Vivek Gautam wrote: >>>>>> Add support to parse extended configuration space for vfio based >>>>>> assigned PCIe devices and add extended capabilities for the device >>>>>> in the guest. This allows the guest to see and work on extended >>>>>> capabilities, for example to toggle PRI extended cap to enable and >>>>>> disable Shared virtual addressing (SVA) support. >>>>>> PCIe extended capability header that is the first DWORD of all >>>>>> extended caps is shown below - >>>>>> >>>>>> 31 20 19 16 15 0 >>>>>> ____________________|_______|_____________________ >>>>>> | Next cap off | 1h | Cap ID | >>>>>> |____________________|_______|_____________________| >>>>>> >>>>>> Out of the two upper bytes of extended cap header, the >>>>>> lower nibble is actually cap version - 0x1. >>>>>> 'next cap offset' if present at bits [31:20], should be >>>>>> right shifted by 4 bits to calculate the position of next >>>>>> capability. >>>>>> This change supports parsing and adding ATS, PRI and PASID caps. >>>>>> >>>>>> Signed-off-by: Vivek Gautam <vivek.gautam@xxxxxxx> >>>>>> --- >>>>>> include/kvm/pci.h | 6 +++ >>>>>> vfio/pci.c | 104 ++++++++++++++++++++++++++++++++++++++++++---- >>>>>> 2 files changed, 103 insertions(+), 7 deletions(-) >>> [..] >>>>>> @@ -725,7 +815,7 @@ static int vfio_pci_parse_caps(struct vfio_device *vdev) >>>>>> static int vfio_pci_parse_cfg_space(struct vfio_device *vdev) >>>>>> { >>>>>> - ssize_t sz = PCI_DEV_CFG_SIZE_LEGACY; >>>>>> + ssize_t sz = PCI_DEV_CFG_SIZE; >>>>>> struct vfio_region_info *info; >>>>>> struct vfio_pci_device *pdev = &vdev->pci; >>>>>> @@ -831,10 +921,10 @@ static int vfio_pci_fixup_cfg_space(struct >>>>>> vfio_device >>>>>> *vdev) >>>>>> /* Install our fake Configuration Space */ >>>>>> info = &vdev->regions[VFIO_PCI_CONFIG_REGION_INDEX].info; >>>>>> /* >>>>>> - * We don't touch the extended configuration space, let's be cautious >>>>>> - * and not overwrite it all with zeros, or bad things might happen. >>>>>> + * Update the extended configuration space as well since we >>>>>> + * are now populating the extended capabilities. >>>>>> */ >>>>>> - hdr_sz = PCI_DEV_CFG_SIZE_LEGACY; >>>>>> + hdr_sz = PCI_DEV_CFG_SIZE; >>>>> >>>>> In one of the earlier versions of the PCI Express patches I was doing the same >>>>> thing here - overwriting the entire PCI Express configuration space for a >>>>> device. >>>>> However, that made one of the devices I was using for testing stop working when >>>>> assigned to a VM. >>>>> >>>>> I'll go through my testing notes and test it again, the cause of the failure >>>>> might >>>>> have been something else entirely which was fixed since then. >>>> >>>> Sure. Let me know your findings. >>> >>> I think I found the card that doesn't work when overwriting the extended device >>> configuration space. I tried device assignment with a Realtek 8168 Gigabit >>> Ethernet card on a Seattle machine, and the host freezes when I try to start a >>> VM. >>> Even after reset, the machine doesn't boot anymore and it gets stuck during the >>> boot process at this message: >>> >>> NewPackageList status: EFI_SUCCESS >>> BDS.SignalConnectDriversEvent(feeb6d60) >>> BDS.ConnectRootBridgeHandles(feeb6db0) >>> >>> It doesn't go away no matter how many times I reset the machine, to get it >>> booting >>> again I have to pull the plug and plug it again. I tried assigning the device >>> to a >>> VM several times, and this happened every time. The card doesn't have the caps >>> that you added, this is caused entirely by the config space write (tried it with >>> only the config space change). >>> >>> It could be a problem kvmtool, with Linux or with the machine, but this is the >>> only machine where device assignment works and I would like to keep it working >>> with this NIC. >> >> Sorry for the delay in responding. I took sometime off work. >> Sure, we will try to keep your machine working :) >> >>> >>> One solution I see is to add a field to vfio_pci_device (something like >>> has_pcie), >>> and based on that, vfio_pci_fixup_cfg_space() could overwrite only the first 256 >>> bytes or the entire device configuration space. >> >> Does the card support PCI extended caps (as seen from the PCI spec v5.0 >> section-7.5)? >> If no, then I guess the check that I am planning to add - to check if the >> device supports extended Caps - can help here. Since we would add extended caps >> based on the mentioned check, it seems only valid to have that check before >> overwriting the configuration space. >> >>> >>> It's also not clear to me what you are trying to achieve with this patch. Is >>> there >>> a particular device that you want to get working? Or an entire class of devices >>> which have those features? If it's the former, you could have the size of the >>> config space write depend on the vendor + device id. If it's the latter, we could >>> key the size of the config space write based on the presence of those particular >>> PCIE caps and try and fix other devices if they break. > > Sorry, I missed adding here that I am adding support for Shared virtual > addressing that would need PCI devices' ATS and PRI extended capability to be > exposed to the guest. That's the reason I am adding this change to write the > device configuration space in kvmtool with ATS and PRI caps. Actually, you did, it's my fault for not parsing the commit message properly. Thanks for repeating it though. Thanks, Alex > > BRs > Vivek > >> >> Absolutely, we can check for the presence of PCI extended capabilities and >> based on that write the configuration space. If the device has issue with only >> a specific extended capability we can try to fix that by keying the >> DevID-VendorID pair? What do you think? >> >>> >>> Will, Andre, do you see other solutions? Do you have a preference? >> >> Will, Andre, please let me know as well if you have any preferences. >> >> Best regards >> Vivek >> >>> >>> Thanks, >>> >>> Alex >>> >>>> >>>> Best regards >>>> Vivek >>>> >>>>> >>>>> Thanks, >>>>> >>>>> Alex >>>>> >>>>>> if (pwrite(vdev->fd, &pdev->hdr, hdr_sz, info->offset) != hdr_sz) { >>>>>> vfio_dev_err(vdev, "failed to write %zd bytes to Config Space", >>>>>> hdr_sz);