On Mon, 2017-08-14 at 22:39 +0200, Borislav Petkov wrote:
> On Mon, Aug 14, 2017 at 08:17:54PM +0000, Kani, Toshimitsu wrote:
> > I think the current code design of allocating mci & ghes_edac_pvt
> > for each GHES source entry makes sense.
>
> And I don't.
>
> > edac_raw_mc_handle_error() also has the same expectation that the
> > call is serialized per mci.
>
> There's no such thing as "per mci" if the driver scans *all DIMMs*
> per register call. If it does it this way, then it is only one mci.

ghes_edac instantiates an mci as a pseudo device representing a GHES
error source. Each error source associates with all DIMMs, and may
report errors independently. As ghes_edac is a GHES error-reporting
wrapper to edac, this abstraction makes sense.

> It is actually wrong right now because if you register more than one
> mci and you do edac_inc_ce_error()/edac_inc_ue_error(), potentially
> different counters get incremented for the same errors. Exactly
> because each instance registered is *wrongly* responsible for all
> DIMMs on the system.

I do not see a problem in having counters for each GHES error source.
This is just statistics info, and ghes_edac does not expect any OS
action from the counters.

> So you either need to partition the DIMMs per mci (which I can't
> imagine how it would work) or introduce locking when incrementing the
> mci->counters.

I do not think changing the calling convention of the edac library
interfaces is a good idea for a special case like ghes_edac. Such
changes can be a burden for us going forward. I think ghes_edac just
needs to work with the current prerequisite.

User apps like ras-mc-ctl work as expected for a given (not-so-great)
DIMM info from SMBIOS as well. I do not see a problem from a user
perspective, either.

Thanks,
-Toshi