Re: [PATCH] PCI: vmd: Enable Hotplug based on BIOS setting on VMD rootports

Hi Nirmal,

On Tue, Dec 12, 2023 at 7:13 AM Nirmal Patel
<nirmal.patel@xxxxxxxxxxxxxxx> wrote:
>
> On Wed, 2023-12-06 at 10:30 -0600, Bjorn Helgaas wrote:
> > [+cc Grant, Rajat, Rajat]
> >
> > On Wed, Dec 06, 2023 at 10:18:56AM +0800, Kai-Heng Feng wrote:
> > > On Wed, Nov 15, 2023 at 5:00 AM Nirmal Patel <
> > > nirmal.patel@xxxxxxxxxxxxxxx> wrote:
> > > > On Wed, 2023-11-08 at 16:49 +0200, Kai-Heng Feng wrote:
> > > > > On Wed, Nov 8, 2023 at 12:30 AM Bjorn Helgaas <
> > > > > helgaas@xxxxxxxxxx> wrote:
> > > ...
> > > > > > I assume you mean to revert 04b12ef163d1 ("PCI: vmd: Honor
> > > > > > ACPI _OSC on PCIe features").  That appeared in v5.17, and it
> > > > > > fixed (or at least prevented) an AER message flood.  We can't
> > > > > > simply revert 04b12ef163d1 unless we first prevent that AER
> > > > > > message flood in another way.
> > > > >
> > > > > The error is "correctable".  Does masking all correctable AER
> > > > > errors by default make any sense, with a sysfs knob added to
> > > > > make it optional?
> > > >
> > > > I assume a sysfs knob requires a driver reload, right? Can you
> > > > send a patch?
> > >
> > > What I mean is to mask Correctable Errors by default on *all*
> > > rootports, and create a new sysfs knob to let the user decide
> > > whether Correctable Errors should be unmasked.
> >
> > I don't think we should mask Correctable Errors by default.  Even
> > though they've been corrected by hardware and no software action is
> > required, I think these errors are valuable signals about Link
> > integrity.
> >
> > I think rate-limiting and/or reporting on the *frequency* of
> > Correctable Errors would make a lot of sense.  We had some work
> > toward this recently, but it hasn't quite gotten finished yet.
> >
> > The most recent work I'm aware of is this:
> > https://lore.kernel.org/r/20230606035442.2886343-1-grundler@xxxxxxxxxxxx
>
> Hi Kai-Heng, Bjorn,
>
> I believe rate limiting alone will not fix the issue; it will only act
> as a workaround. Without 04b12ef163d1, the VMD driver is not aware of
> the OS native AER support setting, so we will see the AER flooding
> issue, which is a bug in the VMD driver since it will always enable AER.

Agreed. Rate limiting doesn't stop the AER interrupts, so it will keep
them from flooding the kernel log, but they will still hog CPU time.
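
To illustrate the point, here is a rough userspace model, not kernel
code. It assumes a simple fixed-window limiter (burst messages per
interval ticks); struct ratelimit, struct aer_stats, ratelimit_ok() and
simulate_flood() are hypothetical names made up for this sketch. It only
shows that throttling the log output leaves the number of handled
interrupts unchanged.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fixed-window limiter, only similar in spirit to the
 * kernel's ratelimit helpers. */
struct ratelimit {
	uint64_t interval;	/* window length, in ticks */
	unsigned int burst;	/* messages allowed per window */
	uint64_t window_start;	/* tick at which the current window began */
	unsigned int printed;	/* messages emitted in the current window */
};

struct aer_stats {
	unsigned long irqs_handled;	/* handler invocations (CPU cost) */
	unsigned long msgs_logged;	/* lines that actually reach the log */
};

/* Return 1 if a message may be logged at tick 'now', 0 if suppressed. */
static int ratelimit_ok(struct ratelimit *rl, uint64_t now)
{
	assert(rl->burst > 0);

	if (now - rl->window_start >= rl->interval) {
		rl->window_start = now;	/* start a new window */
		rl->printed = 0;
	}
	if (rl->printed < rl->burst) {
		rl->printed++;
		return 1;
	}
	return 0;
}

/* Simulate nr_errors correctable errors arriving one per tick. */
static struct aer_stats simulate_flood(unsigned long nr_errors,
				       uint64_t interval, unsigned int burst)
{
	struct ratelimit rl = { interval, burst, 0, 0 };
	struct aer_stats st = { 0, 0 };
	uint64_t now;

	for (now = 0; now < nr_errors; now++) {
		st.irqs_handled++;		/* the interrupt fires regardless */
		if (ratelimit_ok(&rl, now))
			st.msgs_logged++;	/* only the logging is throttled */
	}
	return st;
}
```

With simulate_flood(10000, 1000, 10), the model logs only a small
fraction of the errors, while irqs_handled still equals the number of
errors raised.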

Kai-Heng

> There is a setting in BIOS that allows us to enable OS native AER
> support on the platform. This setting is located in EDK Menu ->
> Platform configuration -> system event log -> IIO error enabling -> OS
> native AER support. I have verified that the above BIOS setting alters
> the native AER flag of _OSC. We can also verify it on Kai-Heng's
> system.
>
> I believe that instead of going in the direction of rate limiting, the
> vmd driver should honor the OS native AER support setting.
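
For reference, "honoring the setting" could be modelled roughly as
below. This is a standalone sketch under assumptions, not the actual
driver code: OSC_AER_CONTROL uses the same bit value (0x8) that Linux
defines for OSC_PCI_EXPRESS_AER_CONTROL, while struct bridge_flags and
the sketch_* helpers are hypothetical stand-ins for the host bridge
native_aer flag and the copy done for the VMD bridge.

```c
#include <stdbool.h>
#include <stdint.h>

/* _OSC control-field bit for PCIe AER; same value as Linux's
 * OSC_PCI_EXPRESS_AER_CONTROL. */
#define OSC_AER_CONTROL 0x00000008u

/* Hypothetical stand-in for the per-bridge "native AER" flag. */
struct bridge_flags {
	bool native_aer;	/* true if the OS owns AER on this bridge */
};

/* Derive the root bridge flag from the control bits firmware granted. */
static struct bridge_flags sketch_root_flags(uint32_t osc_granted)
{
	struct bridge_flags f = { (osc_granted & OSC_AER_CONTROL) != 0 };
	return f;
}

/* The VMD bridge inherits the decision instead of assuming ownership. */
static void sketch_copy_aer_flag(const struct bridge_flags *root,
				 struct bridge_flags *vmd)
{
	vmd->native_aer = root->native_aer;
}

/* AER is enabled below VMD only if firmware handed AER to the OS. */
static bool sketch_vmd_enables_aer(uint32_t osc_granted)
{
	struct bridge_flags root = sketch_root_flags(osc_granted);
	struct bridge_flags vmd = { true };	/* old behaviour: always on */

	sketch_copy_aer_flag(&root, &vmd);
	return vmd.native_aer;
}
```

In this model, flipping the BIOS option changes whether bit 0x8 is
granted, and that alone decides whether AER gets enabled on the VMD
rootports.
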
>
> Do you have any suggestion on this?
>
> nirmal
> >
> > Bjorn
>