On Thu, 21 Jan 2021 11:03:05 +0100 Oscar Salvador <osalvador@xxxxxxx> wrote: > On Wed, Jan 20, 2021 at 04:24:22PM +0800, Aili Yao wrote: > > When a memory uncorrected error is triggered by process who accessed > > the address with error, It's Action Required Case for only current > > process which triggered this; This Action Required case means Action > > optional to other process who share the same page. Usually killing > > current process will be sufficient, other processes sharing the same > > page will get be signaled when they really touch the poisoned page. > > > > But there is another scenario that other processes > > sharing the same page want to be signaled early with PF_MCE_EARLY set, > > In this case, we should get them into kill list and signal > > BUS_MCEERR_AO to them. > > > > So in this patch, task_early_kill will check current process if > > force_early is set, and if not current,the code will fallback to > > find_early_kill_thread() to check if there is PF_MCE_EARLY process > > who cares the error. > > > > In kill_proc(), BUS_MCEERR_AR is only send to current, other processes in > > kill list will be signaled with BUS_MCEERR_AO. > > > > Acked-by: Naoya Horiguchi <naoya.horiguchi@xxxxxxx> > > Signed-off-by: Aili Yao <yaoaili@xxxxxxxxxxxx> > > Looks good to me, a few nits below. > > Reviewed-by: Oscar Salvador <osalvador@xxxxxxx> > > > > @@ -243,9 +243,12 @@ static int kill_proc(struct to_kill *tk, unsigned long pfn, int flags) > > pfn, t->comm, t->pid); > > > > if (flags & MF_ACTION_REQUIRED) { > > - WARN_ON_ONCE(t != current); > > - ret = force_sig_mceerr(BUS_MCEERR_AR, > > + if (tk->tsk == current) > You can re-use "t" here. yeah, this look better, I will change to that. Thanks! > > > + ret = force_sig_mceerr(BUS_MCEERR_AR, > > (void __user *)tk->addr, addr_lsb); > > + else > > + ret = send_sig_mceerr(BUS_MCEERR_AO, (void __user *)tk->addr, > > + addr_lsb, t); > > I would place a brief comment above explaining why we are sending BUS_MCEER_AO > to non-current tasks. > E.g: "Signal other processes sharing the page if they have PF_MCE_EARLY set" Yes, it's good to have > > > @@ -457,8 +463,6 @@ static struct task_struct *task_early_kill(struct task_struct *tsk, > > */ > > if (tsk->mm == current->mm) > > return current; > > - else > > - return NULL; > > if (force_early && task->mm == current->mm) > return current; > > This looks more cleaner. I will modify above in a v6 patch! Thanks, -- Best Regards! Aili Yao