On Mon, Jun 08, 2020 at 03:17:59PM -0700, Luck, Tony wrote:
> On Fri, Jun 05, 2020 at 10:37:19AM +0900, Naoya Horiguchi wrote:
> > Action Required memory error should happen only when a processor is
> > about to access to a corrupted memory, so it's synchronous and only
> > affects current process/thread. Recently commit 872e9a205c84 ("mm,
> > memory_failure: don't send BUS_MCEERR_AO for action required error")
> > fixed the issue that Action Required memory could unnecessarily send
> > SIGBUS to the processes which share the error memory. But we still have
> > another issue that we could send SIGBUS to a wrong thread.
> >
> > This is because collect_procs() and task_early_kill() fails to add the
> > current process to "to-kill" list. So this patch is suggesting to fix
> > it. With this fix, SIGBUS(BUS_MCEERR_AR) is never sent to non-current
> > process/thread.
>
> Does the new code now send SIGBUS(BUS_MCEERR_AO) to all the other threads
> of a multi-threaded process?

No, it doesn't. This patch should not change anything for Action Optional
case, and find_early_kill_thread() chooses one thread per process, so
SIGBUS(BUS_MCEERR_AO) (as well as SIGBUS(BUS_MCEERR_AR)) should be sent only
to the chosen thread.

- Naoya

>
> It looks like it might (and I don't have some handy multi-threaded test
> case to try it out).
>
> If it does, is that what we want?
>
> -Tony
>