Re: [PATCH] PCI: vmd: Enable Hotplug based on BIOS setting on VMD rootports

On Wed, 2023-12-06 at 10:30 -0600, Bjorn Helgaas wrote:
> [+cc Grant, Rajat, Rajat]
> 
> On Wed, Dec 06, 2023 at 10:18:56AM +0800, Kai-Heng Feng wrote:
> > On Wed, Nov 15, 2023 at 5:00 AM Nirmal Patel <
> > nirmal.patel@xxxxxxxxxxxxxxx> wrote:
> > > On Wed, 2023-11-08 at 16:49 +0200, Kai-Heng Feng wrote:
> > > > On Wed, Nov 8, 2023 at 12:30 AM Bjorn Helgaas <
> > > > helgaas@xxxxxxxxxx> wrote:
> > ...
> > > > > I assume you mean to revert 04b12ef163d1 ("PCI: vmd: Honor
> > > > > ACPI _OSC on PCIe features").  That appeared in v5.17, and it
> > > > > fixed (or at least prevented) an AER message flood.  We can't
> > > > > simply revert 04b12ef163d1 unless we first prevent that AER
> > > > > message flood in another way.
> > > > 
> > > > The error is "correctable".  Does masking all correctable AER
> > > > error by default make any sense? And add a sysfs knob to make
> > > > it optional.
> > > 
> > > I assume sysfs knob requires driver reload. right? Can you send a
> > > patch?
> > 
> > What I mean is to mask Correctable Errors by default on *all*
> > rootports, and create a new sysfs knob to let user decide if
> > Correctable Errors should be unmasked.
> 
> I don't think we should mask Correctable Errors by default.  Even
> though they've been corrected by hardware and no software action is
> required, I think these errors are valuable signals about Link
> integrity.
> 
> I think rate-limiting and/or reporting on the *frequency* of
> Correctable Errors would make a lot of sense.  We had some work
> toward this recently, but it hasn't quite gotten finished yet.
> 
> The most recent work I'm aware of is this:
> https://lore.kernel.org/r/20230606035442.2886343-1-grundler@xxxxxxxxxxxx

Hi Kai-Heng, Bjorn,

I believe rate limiting alone will not fix the issue; it will only act
as a workaround. Without 04b12ef163d1, the VMD driver is not aware of
the OS native AER support setting, so it always enables AER. The
resulting AER flood is a bug in the VMD driver.

There is a BIOS setting that enables OS native AER support on the
platform. It is located in EDK Menu -> Platform configuration ->
system event log -> IIO error enabling -> OS native AER support. I
have verified that this BIOS setting changes the native AER flag of
_OSC. We can also verify it on Kai-Heng's system.
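
To illustrate what that flag looks like from software, here is a
minimal user-space sketch of decoding the AER bit in an _OSC control
value. The bit value matches OSC_PCI_EXPRESS_AER_CONTROL in
include/linux/acpi.h; the macro name OSC_AER_CONTROL_BIT and the helper
osc_grants_native_aer() are hypothetical names used only for this
example, not kernel API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same value as OSC_PCI_EXPRESS_AER_CONTROL in include/linux/acpi.h. */
#define OSC_AER_CONTROL_BIT 0x00000008u

/*
 * Hypothetical helper: returns true when the _OSC control value says
 * firmware has granted the OS control of AER (native AER).
 */
static bool osc_grants_native_aer(uint32_t osc_control)
{
	return (osc_control & OSC_AER_CONTROL_BIT) != 0;
}
```

With the BIOS option disabled, the expectation is that this bit is
clear in the control value negotiated for the host bridge.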

I believe that, instead of going in the direction of rate limiting, the
VMD driver should honor the OS native AER support setting.
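
Roughly, "honoring" the setting means the VMD bridge inherits the
ownership bits negotiated for the root bridge rather than assuming the
OS owns AER. The following is a simplified user-space model, not the
driver code: struct bridge_flags stands in for the native_* bits of
struct pci_host_bridge, and copy_bridge_flags() and
vmd_should_enable_aer() are hypothetical names for illustration.

```c
#include <stdbool.h>

/* Simplified stand-in for the native_* bits of struct pci_host_bridge. */
struct bridge_flags {
	bool native_aer;
	bool native_pcie_hotplug;
};

/* Hypothetical model: the VMD bridge inherits what _OSC granted the root bridge. */
static void copy_bridge_flags(const struct bridge_flags *root,
			      struct bridge_flags *vmd)
{
	vmd->native_aer = root->native_aer;
	vmd->native_pcie_hotplug = root->native_pcie_hotplug;
}

/* AER is enabled on VMD root ports only when the OS owns it. */
static bool vmd_should_enable_aer(const struct bridge_flags *vmd)
{
	return vmd->native_aer;
}
```

In this model, a platform whose BIOS keeps AER under firmware control
ends up with AER left disabled on the VMD root ports, which is the
behavior argued for above.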

Do you have any suggestions on this?

nirmal
> 
> Bjorn




