Re: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac

On Tue, 2017-07-18 at 08:00 +0200, Borislav Petkov wrote:
> On Mon, Jul 17, 2017 at 03:59:12PM -0600, Toshi Kani wrote:
> > The ghes_edac driver was introduced in 2013 [1], but it has not
> > been enabled by any distro yet.  This driver obtains error info
> > from firmware interfaces, which are not properly implemented on
> > many platforms, as the driver always emits the messages below:
> > 
> >  This EDAC driver relies on BIOS to enumerate memory and get error
> > reports.  Unfortunately, not all BIOSes reflect the memory layout
> > correctly.  So, the end result of using this driver varies from
> > vendor to vendor.  If you find incorrect reports, please contact
> > your hardware vendor to correct its BIOS.
> > 
> > To get out of this situation, add a platform type check to
> > selectively enable the driver on the platforms that are known to
> > have proper firmware implementation.  Platform vendors can add
> > their platforms to the list when they support ghes_edac.
> 
> So maintaining whitelists for things has always been a PITA and we
> should try to avoid it, if possible. (We can always do it if nothing
> saner comes along.)

Agreed.
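
For reference, the kind of platform check described in the patch summary
can be sketched in plain userspace C as below.  This is only an
illustration, not the code from the patch: the struct, the function name
platform_is_allowed(), and the "VENDA"/"VENDB" entries are all
hypothetical, and matching on a pair of OEM identifier strings is an
assumption about how a platform would be identified.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical allowlist entry: a pair of OEM identifier strings. */
struct platform_entry {
	const char *oem_id;
	const char *oem_table_id;
};

/* Illustrative entries only; these are not real validated platforms. */
static const struct platform_entry ghes_edac_allowlist[] = {
	{ "VENDA", "SERVER1" },
	{ "VENDB", "SERVER2" },
};

/* Return true only if both identifiers match the same allowlist entry. */
static bool platform_is_allowed(const char *oem_id, const char *oem_table_id)
{
	size_t n = sizeof(ghes_edac_allowlist) / sizeof(ghes_edac_allowlist[0]);
	size_t i;

	for (i = 0; i < n; i++) {
		if (!strcmp(ghes_edac_allowlist[i].oem_id, oem_id) &&
		    !strcmp(ghes_edac_allowlist[i].oem_table_id, oem_table_id))
			return true;
	}
	return false;
}
```

Every new platform needs a new table entry, which is the maintenance
cost being pointed out above.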

> Now, below is a dirty patch converting ghes_edac to a normal module.
> On systems where we have GHES, the firmware generally disables the
> detection of the presence of ECC hardware, thus preventing the
> platform EDAC driver from loading.

I have HPE Haswell and Skylake test systems with GHES, but they do not
hide IMCs from the OS.  So, the sb_edac and skx_edac drivers get
attached on these systems when ghes_edac is disabled.

> Let me clarify: I have an AMD HP box which, when GHES is enabled in
> the BIOS, says that ECC is disabled in the memory controller and the
> amd64_edac driver doesn't load for that memory controller.

Hmm... what's the platform name of this box?  I can look into this case
if you need.

> And I think we should try this first: have the firmware disable
> detection methods so that the platform drivers don't load.

I do not think we can rely on this method.

> Then, ghes_edac can be a simple module and no other driver would
> attempt loading.

I like the use of a notifier chain, which is much cleaner.
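
To illustrate the general pattern (not the code in the patch under
discussion, and not the kernel's notifier API), a notifier chain can be
sketched in userspace C as below.  All names here (struct err_notifier,
err_chain_register(), err_chain_notify(), count_event()) are hypothetical.
The producer only walks a list of callbacks, so a consumer can register
when it loads and unregister when it goes away, with no direct call
between the two.

```c
#include <stddef.h>

#define ERR_NOTIFY_DONE 0	/* keep walking the chain */
#define ERR_NOTIFY_STOP 1	/* event consumed, stop walking */

/* Hypothetical chain element: one callback plus its priority. */
struct err_notifier {
	int (*call)(struct err_notifier *nb, unsigned long event, void *data);
	int priority;
	struct err_notifier *next;
};

static struct err_notifier *err_chain;

/* Insert so that higher-priority callbacks are called first. */
static void err_chain_register(struct err_notifier *nb)
{
	struct err_notifier **p = &err_chain;

	while (*p && (*p)->priority >= nb->priority)
		p = &(*p)->next;
	nb->next = *p;
	*p = nb;
}

/* Unlink nb from the chain if it is present. */
static void err_chain_unregister(struct err_notifier *nb)
{
	struct err_notifier **p = &err_chain;

	while (*p && *p != nb)
		p = &(*p)->next;
	if (*p)
		*p = nb->next;
}

/* Call each registered callback in order; return how many were called. */
static int err_chain_notify(unsigned long event, void *data)
{
	struct err_notifier *nb;
	int calls = 0;

	for (nb = err_chain; nb; nb = nb->next) {
		calls++;
		if (nb->call(nb, event, data) == ERR_NOTIFY_STOP)
			break;
	}
	return calls;
}

/* Example consumer: count events in the int that data points to. */
static int count_event(struct err_notifier *nb, unsigned long event, void *data)
{
	(void)nb;
	(void)event;
	(*(int *)data)++;
	return ERR_NOTIFY_DONE;
}
```

With nothing registered, err_chain_notify() simply calls no one, which
is what makes it reasonable for the consumer to be an ordinary module.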

> The question is: does the platform do this disabling now?

Unfortunately, that is not the case today.  On Skylake at least, the
IMCs cannot be hidden with the Device Hide registers.

Thanks,
-Toshi