Hi Alex, On 7/9/23 15:26, Alex Williamson wrote: > On Sun, 9 Jul 2023 13:50:35 +0200 > Eric Auger <eric.auger@xxxxxxxxxx> wrote: > >> Hi Alex, >> >> On 7/7/23 17:10, Alex Williamson wrote: >>> Unlike default access to config space through sysfs, the vpd read and >>> write function don't actively manage the runtime power management state >>> of the device during access. Since commit 7ab5e10eda02 ("vfio/pci: Move >>> the unused device into low power state with runtime PM"), the vfio-pci >>> driver will use runtime power management and release unused devices to >>> make use of low power states. Attempting to access VPD information in >>> this low power state can result in incorrect information or kernel >>> crashes depending on the system behavior. >>> >>> Wrap the vpd read/write bin attribute handlers in runtime PM and take >>> into account the potential quirk to select the correct device to wake. >> This much improved the situation as it is more difficult to hit the >> issue. Unfortunately after tens of attempts I was still able to hit a >> kernel crash. The console output does not mention the VPD anymore but >> PCI power management events (PME). > Does combining this with an extended D3hot wakeup for the device make > any difference, such as a5a6dd262469 ("PCI/PM: Extend D3hot delay for > NVIDIA HDA controllers")? Thanks, I added static void quirk_qlogic_nic_pm(struct pci_dev *dev) { dev_info(&dev->dev, "****** %s", __func__); quirk_d3hot_delay(dev, 20); } DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_QLOGIC, PCI_ANY_ID, PCI_CLASS_NETWORK_ETHERNET, 8, quirk_qlogic_nic_pm) and checked it was called, but I can still hit after 10's of iterations [ 152.788557] Modules linked in: xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf _tables nfnetlink rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache netfs bridge stp llc rfkill vfat fat qedr ib_uverbs vfio_pci ib_core vfio_pci_core vfio_iommu_type1 crct10dif_c e ghash_ce vfio i2c_smbus sha1_ce ipmi_ssif ipmi_devintf nfsd ipmi_msghandler thunderx2_pmu auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sg ixgbevf ast drm_km s_helper qede i2c_algo_bit drm_shmem_helper qed drm ixgbe e1000e mpt3sas nvme nvme_core sha2_ce sha256_arm64 raid_class scsi_transport_sas nvme_common crc8 mdio gpio_xlp i2c_xl p9xx dm_mirror dm_region_hash dm_log dm_mod fuse [ 152.854119] CPU: 112 PID: 1416 Comm: kworker/112:7 Not tainted 6.4.0-vpd-upstream+ #57 [ 152.862024] Hardware name: GIGABYTE R181-T90-00/MT91-FS1-00, BIOS F34 08/13/2020 [ 152.869406] Workqueue: events_freezable pci_pme_list_scan [ 152.874810] pstate: 604000c9 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 152.881760] pc : pci_generic_config_read+0x64/0xd8 [ 152.886549] lr : pci_generic_config_read+0x28/0xd8 [ 152.891328] sp : ffff80008cb23cb0 [ 152.894630] x29: ffff80008cb23cb0 x28: 0000000000000000 x27: 0000000000000000 [ 152.901756] x26: 0000000000000000 x25: 0000000000000001 x24: ffff80008cb23d76 [ 152.908881] x23: ffff000806331800 x22: 0000000000000044 x21: ffff80008232d4c8 [ 152.916006] x20: ffff80008cb23d24 x19: 0000000000000002 x18: 0000000000000000 [ 152.923130] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000001 [ 152.930255] x14: ffffffffffffffff x13: 0000000000000000 x12: 0101010101010101 [ 152.937379] x11: 7f7f7f7f7f7f7f7f x10: fefefefefefefeff x9 : ffff8000807b3938 [ 152.944504] x8 : fefefefefefefeff x7 : 0000000000000018 x6 : 000000000000007f [ 152.951628] x5 : 0000000000000000 x4 : ffff800090000000 x3 : 0000000000000000 [ 152.958752] x2 : 0000000000000044 x1 : 0000000000c01044 x0 : ffff800090c01044 [ 152.965878] Call trace: [ 152.968312] pci_generic_config_read+0x64/0xd8 [ 152.972745] pci_bus_read_config_word+0x84/0xe8 [ 152.977264] pci_read_config_word+0x30/0x50 [ 152.981435] pci_check_pme_status+0x70/0xa8 [ 152.985606] pci_pme_list_scan+0x84/0x158 [ 152.989604] process_one_work+0x1ec/0x488 [ 152.993616] worker_thread+0x48/0x400 [ 152.997267] kthread+0x10c/0x120 [ 153.000492] ret_from_fork+0x10/0x20 [ 153.004063] Code: 52800000 a94153f3 a8c27bfd d65f03c0 (79400000) [ 153.010146] ---[ end trace 0000000000000000 ]--- [ 153.014751] Kernel panic - not syncing: synchronous external abort: Fatal exception [ 153.022395] SMP: stopping secondary CPUs [ 154.081107] SMP: failed to stop secondary CPUs 112,194 [ 154.086236] Kernel Offset: 0x190000 from 0xffff800080000000 [ 154.091796] PHYS_OFFSET: 0x80000000 [ 154.095272] CPU features: 0x01800000,0a0140a1,08004203 [ 154.100398] Memory Limit: none [ 154.103440] ---[ end Kernel panic - not syncing: synchronous external abort: Fatal exception ]--- Eric > > Alex > > >> [ 168.616700] CPU: 0 PID: 1409 Comm: kworker/0:5 Not tainted >> 6.4.0-vpd-upstream+ #56 >> [ 168.624257] Hardware name: GIGABYTE R181-T90-00/MT91-FS1-00, BIOS F34 >> 08/13/2020 >> [ 168.631639] Workqueue: events_freezable pci_pme_list_scan >> [ 168.637032] pstate: 604000c9 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS >> BTYPE=--) >> [ 168.643982] pc : pci_generic_config_read+0x64/0xd8 >> [ 168.648763] lr : pci_generic_config_read+0x28/0xd8 >> [ 168.653542] sp : ffff80008caebcb0 >> [ 168.656844] x29: ffff80008caebcb0 x28: 0000000000000000 x27: >> 0000000000000000 >> [ 168.663969] x26: 0000000000000000 x25: 0000000000000000 x24: >> ffff80008caebd76 >> [ 168.671094] x23: ffff0008063fd800 x22: 0000000000000044 x21: >> ffff80008232d4c8 >> [ 168.678218] x20: ffff80008caebd24 x19: 0000000000000002 x18: >> 00000000000040fc >> [ 168.685342] x17: 00000000000040f8 x16: 0000000000000000 x15: >> 0000000000000001 >> [ 168.692466] x14: ffffffffffffffff x13: 0000000000000000 x12: >> 0101010101010101 >> [ 168.699590] x11: 7f7f7f7f7f7f7f7f x10: fefefefefefefeff x9 : >> ffff8000807b3938 >> [ 168.706714] x8 : fefefefefefefeff x7 : 0000000000000018 x6 : >> 000000000000007f >> [ 168.713838] x5 : 0000000000000000 x4 : ffff800090000000 x3 : >> 0000000000000000 >> [ 168.720962] x2 : 0000000000000044 x1 : 0000000000c00044 x0 : >> ffff800090c00044 >> [ 168.728086] Call trace: >> [ 168.730520] pci_generic_config_read+0x64/0xd8 >> [ 168.734953] pci_bus_read_config_word+0x84/0xe8 >> [ 168.739471] pci_read_config_word+0x30/0x50 >> [ 168.743642] pci_check_pme_status+0x70/0xa8 >> [ 168.747813] pci_pme_list_scan+0x84/0x158 >> [ 168.751811] process_one_work+0x1ec/0x488 >> [ 168.755810] worker_thread+0x48/0x400 >> [ 168.759461] kthread+0x10c/0x120 >> [ 168.762678] ret_from_fork+0x10/0x20 >> [ 168.766245] Code: 52800000 a94153f3 a8c27bfd d65f03c0 (79400000) >> [ 168.772326] ---[ end trace 0000000000000000 ]--- >> [ 168.776931] Kernel panic - not syncing: synchronous external abort: >> Fatal exception >> [ 168.784574] SMP: stopping secondary CPUs >> [ 169.831001] SMP: failed to stop secondary CPUs 0,199 >> [ 169.835955] Kernel Offset: 0x190000 from 0xffff800080000000 >> >> >> Thanks >> >> Eric >>> Signed-off-by: Alex Williamson <alex.williamson@xxxxxxxxxx> >>> --- >>> drivers/pci/vpd.c | 34 ++++++++++++++++++++++++++++++++-- >>> 1 file changed, 32 insertions(+), 2 deletions(-) >>> >>> diff --git a/drivers/pci/vpd.c b/drivers/pci/vpd.c >>> index a4fc4d0690fe..81217dd4789f 100644 >>> --- a/drivers/pci/vpd.c >>> +++ b/drivers/pci/vpd.c >>> @@ -275,8 +275,23 @@ static ssize_t vpd_read(struct file *filp, struct kobject *kobj, >>> size_t count) >>> { >>> struct pci_dev *dev = to_pci_dev(kobj_to_dev(kobj)); >>> + struct pci_dev *vpd_dev = dev; >>> + ssize_t ret; >>> + >>> + if (dev->dev_flags & PCI_DEV_FLAGS_VPD_REF_F0) { >>> + vpd_dev = pci_get_func0_dev(dev); >>> + if (!vpd_dev) >>> + return -ENODEV; >>> + } >>> + >>> + pci_config_pm_runtime_get(vpd_dev); >>> + ret = pci_read_vpd(vpd_dev, off, count, buf); >>> + pci_config_pm_runtime_put(vpd_dev); >>> + >>> + if (dev != vpd_dev) >>> + pci_dev_put(vpd_dev); >>> >>> - return pci_read_vpd(dev, off, count, buf); >>> + return ret; >>> } >>> >>> static ssize_t vpd_write(struct file *filp, struct kobject *kobj, >>> @@ -284,8 +299,23 @@ static ssize_t vpd_write(struct file *filp, struct kobject *kobj, >>> size_t count) >>> { >>> struct pci_dev *dev = to_pci_dev(kobj_to_dev(kobj)); >>> + struct pci_dev *vpd_dev = dev; >>> + ssize_t ret; >>> + >>> + if (dev->dev_flags & PCI_DEV_FLAGS_VPD_REF_F0) { >>> + vpd_dev = pci_get_func0_dev(dev); >>> + if (!vpd_dev) >>> + return -ENODEV; >>> + } >>> + >>> + pci_config_pm_runtime_get(vpd_dev); >>> + ret = pci_write_vpd(vpd_dev, off, count, buf); >>> + pci_config_pm_runtime_put(vpd_dev); >>> + >>> + if (dev != vpd_dev) >>> + pci_dev_put(vpd_dev); >>> >>> - return pci_write_vpd(dev, off, count, buf); >>> + return ret; >>> } >>> static BIN_ATTR(vpd, 0600, vpd_read, vpd_write, 0); >>>