On Monday, August 8, 2022 12:38 PM, Borislav Petkov wrote:
> On Mon, Aug 08, 2022 at 08:17:58PM +0200, Rafael J. Wysocki wrote:
> > This effectively makes EDAC depend on GHES which may not be always
> > valid AFAICS.
>
> Yes, and this has been getting on my nerves since forever.
>
> The GHES code which does collect all those errors *forces* the registration of
> an EDAC module which does only the reporting.
>
> Which cannot be any more backwards.
>
> What should happen is, GHES inits and starts working on the errors.
> Then, at some point later, ghes_edac loads and starts reporting whatever it
> gets. If there's no EDAC module, it doesn't report them. The same way MCA
> works.
>
> That's it.
>
> And then ghes_edac can be made a normal module again and we can get rid
> of this insanity.

The following approach may be worth considering:

- Separate ghes_edac_register() into two functions, e.g.,
  ghes_edac_register() and ghes_edac_init().

- ghes_edac_register() keeps only the first if-block (the IS_ENABLED()
  and force_load check), and then calls a new function,
  edac_set_owner(mod_name), which simply sets edac_mc_owner to mod_name.
  This allows ghes_edac_register() to run before edac_init(), and
  setting edac_mc_owner prevents a chipset-specific EDAC driver from
  being loaded before ghes_edac.

- ghes_edac_init() first calls edac_get_owner() and compares the result
  with its own mod_name. If they match, it performs the rest of the
  original ghes_edac_register() procedure. This ghes_edac_init() is
  called from the normal module init path, e.g., module_init().

Thanks,
Toshi
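
For illustration, the split proposed above could be sketched in user-space
C roughly as follows. This is not kernel code and not a patch: the bodies
are stand-ins, edac_set_owner() and edac_get_owner() are the new helpers
suggested above (they are assumptions, not existing API), the
IS_ENABLED()/force_load check is modelled as a plain bool argument, and
chipset_edac_init() and the -1 return values are hypothetical placeholders
for a chipset-specific driver and for real error codes.

```c
/* User-space sketch only: names mirror the proposal, bodies are
 * stand-ins, and nothing here is the actual kernel implementation. */
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define GHES_EDAC_MOD_NAME "ghes_edac"

/* Stand-in for the EDAC core's edac_mc_owner; NULL means unclaimed. */
static const char *edac_mc_owner;

/* Proposed helper (assumption): record the owning module name. */
static void edac_set_owner(const char *mod_name)
{
	edac_mc_owner = mod_name;
}

/* Proposed helper (assumption): report the current owner, if any. */
static const char *edac_get_owner(void)
{
	return edac_mc_owner;
}

/* Early step, callable before edac_init(): only the platform/force_load
 * check (modelled here as a bool) plus the ownership claim. */
static int ghes_edac_register(bool platform_ok_or_forced)
{
	if (!platform_ok_or_forced)
		return -1;	/* placeholder for an error code */
	edac_set_owner(GHES_EDAC_MOD_NAME);
	return 0;
}

/* Late step, from the normal module init path: continue only if
 * ghes_edac holds the ownership claim. */
static int ghes_edac_init(void)
{
	const char *owner = edac_get_owner();

	if (!owner || strcmp(owner, GHES_EDAC_MOD_NAME) != 0)
		return -1;	/* placeholder for an error code */
	/* ... rest of the original ghes_edac_register() would go here ... */
	return 0;
}

/* Hypothetical chipset-specific driver: backs off if another module
 * already owns EDAC, otherwise claims ownership itself. */
static int chipset_edac_init(const char *mod_name)
{
	const char *owner = edac_get_owner();

	if (owner && strcmp(owner, mod_name) != 0)
		return -1;	/* placeholder for an error code */
	edac_set_owner(mod_name);
	return 0;
}
```

The intended call order is ghes_edac_register() from the GHES side first,
then ghes_edac_init() from module_init(); a chipset-specific driver that
initializes in between finds the owner already set and backs off.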