On Wed, 2017-07-19 at 07:52 +0200, Borislav Petkov wrote:
> On Tue, Jul 18, 2017 at 09:20:44PM +0000, Kani, Toshimitsu wrote:
> > I agree that 'osc_sb_apei_support_acked' should be checked when
> > enabling ghes_edac.  I do not know the details of existing issues,
> > but it sounds unlikely that this will address all of them since
> > bugs can be everywhere.
>
> No, see below.
>
> > For instance, ghes_edac relies on DMI/SMBIOS info, unlike
> > other EDAC drivers, which can be buggy regardless of this _OSC
> > info.
>
> That's the problem with firmware. You can't really fix it and it is
> buggy as hell.

Right, and that's what I was told as an issue for ghes_edac.  This is
why this patch introduces a white-list to preclude all buggy firmwares
that are unknown to us...

> > I agree that making ghes_edac as a normal module is a good thing,
> > but I do not think it's going to solve this issue.
>
> Of course it will - if the firmware says it wants to look at the
> errors first, then it gets to do so. This is the whole handling of
> hardware errors in the firmware deal. I admit, sometimes it makes
> sense because the firmware has the most intimate knowledge of the
> platform and, in a perfect world, we won't ever need to have
> platform-specific EDAC drivers.
>
> But, we don't live in a perfect world. And the vendor execution of
> the whole firmware-error-handling deal is an abomination at best.
>
> So, if we realize that the firmware is buggy, we can use a platform
> list to blacklist it (^hint hint^) and have a parameter to disable
> ghes_edac from loading.

Setting blacklist needs us to enable ghes_edac and find all buggy
firmwares to date.  I think this is too disturbing for people who are
happily using regular edac drivers today even though their platforms
have GHES.

> But we'll deal with that when we get to cross that bridge. Right now,
> I'd like to do the loading spec-conform and not fiddle with white-,
> black-, or any-other-color lists.

I do prefer to avoid any white / black listing.  But I do not see how
it solves the buggy DMI/SMBIOS info as an example of firmware bugs we
may have to deal with.

Thanks,
-Toshi