On Thu, May 02, 2024 at 03:38:00PM -0700, Paul M Stillwell Jr wrote: > On 5/2/2024 3:08 PM, Bjorn Helgaas wrote: > > On Mon, Apr 08, 2024 at 11:39:27AM -0700, Paul M Stillwell Jr wrote: > > > Commit 04b12ef163d1 ("PCI: vmd: Honor ACPI _OSC on PCIe features") added > > > code to copy the _OSC flags from the root bridge to the host bridge for each > > > vmd device because the AER bits were not being set correctly which was > > > causing an AER interrupt storm for certain NVMe devices. > > > > > > This works fine in bare metal environments, but causes problems when the > > > vmd driver is run in a hypervisor environment. In a hypervisor all the > > > _OSC bits are 0 despite what the underlying hardware indicates. This is > > > a problem for vmd users because if vmd is enabled the user *always* > > > wants hotplug support enabled. To solve this issue the vmd driver always > > > enables the hotplug bits in the host bridge structure for each vmd. > > > > > > Fixes: 04b12ef163d1 ("PCI: vmd: Honor ACPI _OSC on PCIe features") > > > Signed-off-by: Nirmal Patel <nirmal.patel@xxxxxxxxxxxxxxx> > > > Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@xxxxxxxxx> > > > --- > > > drivers/pci/controller/vmd.c | 10 ++++++++-- > > > 1 file changed, 8 insertions(+), 2 deletions(-) > > > > > > diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c > > > index 87b7856f375a..583b10bd5eb7 100644 > > > --- a/drivers/pci/controller/vmd.c > > > +++ b/drivers/pci/controller/vmd.c > > > @@ -730,8 +730,14 @@ static int vmd_alloc_irqs(struct vmd_dev *vmd) > > > static void vmd_copy_host_bridge_flags(struct pci_host_bridge *root_bridge, > > > struct pci_host_bridge *vmd_bridge) > > > { > > > - vmd_bridge->native_pcie_hotplug = root_bridge->native_pcie_hotplug; > > > - vmd_bridge->native_shpc_hotplug = root_bridge->native_shpc_hotplug; > > > + /* > > > + * there is an issue when the vmd driver is running within a hypervisor > > > + * because all of the _OSC bits are 0 in that case. this disables > > > + * hotplug support, but users who enable VMD in their BIOS always want > > > + * hotplug suuport so always enable it. > > > + */ > > > + vmd_bridge->native_pcie_hotplug = 1; > > > + vmd_bridge->native_shpc_hotplug = 1; > > > > Deferred for now because I think we need to figure out how to set all > > these bits the same, or at least with a better algorithm than "here's > > what we want in this environment." > > > > Extended discussion about this at > > https://lore.kernel.org/r/20240417201542.102-1-paul.m.stillwell.jr@xxxxxxxxx > > That's ok by me. I thought where we left it was that if we could find a > solution to the Correctable Errors from the original issue that maybe we > could revert 04b12ef163d1. > > I'm not sure I would know if a patch that fixes the Correctable Errors comes > in... We have a test case we would like to test against that was pre > 04b12ef163d1 (BIOS has AER disabled and we hotplug a disk which results in > AER interrupts) so we would be curious if the issues we saw before goes away > with a new patch for Correctable Errors. My current theory is that there's some issue with that particular Samsung NVMe device that causes the Correctable Error flood. Kai-Heng says they happen even with VMD disabled. And there are other reports that don't seem to involve VMD but do involve this NVMe device ([144d:a80a]): https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1852420 https://forums.unraid.net/topic/113521-constant-errors-on-logs-after-nvme-upgrade/ https://forums.unraid.net/topic/118286-nvme-drives-throwing-errors-filling-logs-instantly-how-to-resolve/ https://forum.proxmox.com/threads/pve-kernel-panics-on-reboots.144481/ https://www.eevblog.com/forum/general-computing/linux-mint-21-02-clone-replace-1tb-nvme-with-a-2tb-nvme/ NVMe has weird power management stuff, so it's always possible we're doing something wrong in a driver. But I think we really need to handle Correctable Errors better by: - Possibly having drivers mask errors if they know about defects - Making the log messages less alarming, e.g., a single line report - Rate-limiting them so they're never overwhelming - Maybe automatically masking them in the PCI core to avoid interrupts > > > vmd_bridge->native_aer = root_bridge->native_aer; > > > vmd_bridge->native_pme = root_bridge->native_pme; > > > vmd_bridge->native_ltr = root_bridge->native_ltr; > > > -- > > > 2.39.1 > > > > > >