Hi Zhang,

Sorry for my late response.

On Tue, May 26, 2020 at 03:06:41PM +0800, Wetp Zhang wrote:
> From: Zhang Yi <wetpzy@xxxxxxxxx>
>
> If a process don't need early-kill, it may not care the BUS_MCEERR_AO.
> Let the process to be killed when it really access the corrupted memory.
>
> Signed-off-by: Zhang Yi <wetpzy@xxxxxxxxx>

Thank you for pointing this out. This looks like a bug to me (the
per-process flag is ignored when the system-wide flag is set).

> ---
>  mm/memory-failure.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index a96364be8ab4..2db13d48865c 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -210,7 +210,7 @@ static int kill_proc(struct to_kill *tk, unsigned long pfn, int flags)
>  {
>  	struct task_struct *t = tk->tsk;
>  	short addr_lsb = tk->size_shift;
> -	int ret;
> +	int ret = 0;
>
>  	pr_err("Memory failure: %#lx: Sending SIGBUS to %s:%d due to hardware memory corruption\n",
>  		pfn, t->comm, t->pid);
> @@ -225,8 +225,9 @@ static int kill_proc(struct to_kill *tk, unsigned long pfn, int flags)
>  	 * This could cause a loop when the user sets SIGBUS
>  	 * to SIG_IGN, but hopefully no one will do that?
>  	 */
> -	ret = send_sig_mceerr(BUS_MCEERR_AO, (void __user *)tk->addr,
> -			addr_lsb, t); /* synchronous? */
> +	if ((t->flags & PF_MCE_PROCESS) && (t->flags & PF_MCE_EARLY))
> +		ret = send_sig_mceerr(BUS_MCEERR_AO,
> +			(void __user *)tk->addr, addr_lsb, t);

kill_proc() can only be called for processes that collect_procs()
selects via task_early_kill(). So I think that we should fix
task_early_kill(), maybe by reordering the
sysctl_memory_failure_early_kill check and the find_early_kill_thread()
check.

    static struct task_struct *task_early_kill(struct task_struct *tsk,
                                               int force_early)
    {
            struct task_struct *t;

            if (!tsk->mm)
                    return NULL;
            if (force_early)
                    return tsk;
            t = find_early_kill_thread(tsk);
            if (t)
                    return t;
            if (sysctl_memory_failure_early_kill)
                    return tsk;
            return NULL;
    }

One subtlety is that find_early_kill_thread() needs to distinguish the
default value from an explicitly set value, so we might need some
modification to find_early_kill_thread().

Can you try that?

Thanks,
Naoya Horiguchi
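
P.S. To make the ordering concrete, here is a rough user-space sketch of
the decision only. It is not kernel code and has not been tested against
the tree. struct model_task, model_task_early_kill() and the flag values
are hypothetical stand-ins, and it looks at a single task rather than
walking every thread the way find_early_kill_thread() does. The point is
that an explicitly set per-process policy is consulted first, and the
sysctl acts only as the default for tasks that never set one.

```c
#include <stddef.h>

/* Placeholder bit values for this model; not taken from kernel headers. */
#define PF_MCE_PROCESS 0x1u	/* per-process policy was set explicitly */
#define PF_MCE_EARLY   0x2u	/* the explicit policy is early kill */

/* Hypothetical stand-in for the few task_struct fields the check reads. */
struct model_task {
	unsigned int flags;
	int has_mm;		/* models tsk->mm != NULL */
};

/* Models the system-wide default (the sysctl). */
static int sysctl_memory_failure_early_kill;

/* Return the task if it should be killed early, NULL otherwise. */
static struct model_task *model_task_early_kill(struct model_task *tsk,
						int force_early)
{
	if (!tsk->has_mm)
		return NULL;
	if (force_early)
		return tsk;
	/* An explicit per-process setting wins, whichever way it points. */
	if (tsk->flags & PF_MCE_PROCESS)
		return (tsk->flags & PF_MCE_EARLY) ? tsk : NULL;
	/* No explicit setting: fall back to the system-wide default. */
	return sysctl_memory_failure_early_kill ? tsk : NULL;
}
```

With the sysctl set to 1, a task carrying PF_MCE_PROCESS without
PF_MCE_EARLY gets NULL here (late kill), while a task with neither flag
is still returned for early kill.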