On Mon, Aug 22, 2022 at 03:40:42PM +0000, Jia He wrote: > Commit dc4e8c07e9e2 ("ACPI: APEI: explicit init of HEST and GHES in > apci_init()") introduced a bug that ghes_edac_register() would be invoked > before edac_init(). Because at that time, the bus "edac" hadn't been even > registered, this created sysfs /devices/mc0 instead of > /sys/devices/system/edac/mc/mc0 on an Ampere eMag server. > > To remove the dependency of ghes_edac on ghes, make it a proper module. Use > a list to save the probing devices in ghes_probe(), and defer the > ghes_edac_register() to module_init() of the new ghes_edac module by > iterating over the devices list. > > Co-developed-by: Borislav Petkov <bp@xxxxxxxxx> > Signed-off-by: Borislav Petkov <bp@xxxxxxxxx> > Signed-off-by: Jia He <justin.he@xxxxxxx> > Fixes: dc4e8c07e9e2 ("ACPI: APEI: explicit init of HEST and GHES in apci_init()") > Cc: stable@xxxxxxxxxx Why is this marked for stable? The prerequisite patches are needed too. I guess this needs to be communicated to stable folks somehow by doing Cc: stable@xxxxxxxxxx # needs commits X, Y, ... but I guess the committer needs to do that because only at commit time will X and Y be known... So, is there any particular reason why this should be in stable? > @@ -1442,7 +1449,9 @@ static int ghes_remove(struct platform_device *ghes_dev) > > ghes_fini(ghes); > > - ghes_edac_unregister(ghes); > + mutex_lock(&ghes_devs_mutex); > + list_del_rcu(&ghes->elist); Is that list RCU-protected? > + mutex_unlock(&ghes_devs_mutex); > > kfree(ghes); ... > @@ -566,3 +549,35 @@ void ghes_edac_unregister(struct ghes *ghes) > unlock: > mutex_unlock(&ghes_reg_mutex); > } > + > +static int __init ghes_edac_init(void) > +{ > + struct ghes *g, *g_tmp; > + > + if (!IS_ENABLED(CONFIG_X86)) > + force_load = true; No, this is not how this works. > + ghes_devs = ghes_get_devices(force_load); > + if (!ghes_devs) > + return -ENODEV; You simply need to check force_load here. > + list_for_each_entry_safe(g, g_tmp, ghes_devs, elist) { > + ghes_edac_register(g->dev); > + } > + > + return 0; > +} -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette