On Tue, May 20, 2014 at 09:28:00AM -0700, Luck, Tony wrote:
> When a thread in a multi-threaded application hits a machine
> check because of an uncorrectable error in memory - we want to
> send the SIGBUS with si.si_code = BUS_MCEERR_AR to that thread.
> Currently we fail to do that if the active thread is not the
> primary thread in the process. collect_procs() just finds primary
> threads and this test:
> 	if ((flags & MF_ACTION_REQUIRED) && t == current) {
> will see that the thread we found isn't the current thread
> and so send a si.si_code = BUS_MCEERR_AO to the primary
> (and nothing to the active thread at this time).
>
> We can fix this by checking whether "current" shares the same
> mm with the process that collect_procs() said owned the page.
> If so, we send the SIGBUS to current (with code BUS_MCEERR_AR).
>
> Reported-by: Otto Bruggeman <otto.g.bruggeman@xxxxxxxxx>
> Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>
> ---
>  mm/memory-failure.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 35ef28acf137..642c8434b166 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -204,9 +204,9 @@ static int kill_proc(struct task_struct *t, unsigned long addr, int trapno,
>  #endif
>  	si.si_addr_lsb = compound_order(compound_head(page)) + PAGE_SHIFT;
>
> -	if ((flags & MF_ACTION_REQUIRED) && t == current) {
> +	if ((flags & MF_ACTION_REQUIRED) && t->mm == current->mm) {
>  		si.si_code = BUS_MCEERR_AR;
> -		ret = force_sig_info(SIGBUS, &si, t);
> +		ret = force_sig_info(SIGBUS, &si, current);
>  	} else {
>  		/*
>  		 * Don't use force here, it's convenient if the signal
> --
> 1.8.4.1

Very interesting. I remember there was an earlier thread about AO
errors. Here is the link:
http://www.spinics.net/lists/linux-mm/msg66653.html

Based on that thread, I have two concerns:

1) How do we handle a scenario similar to the one in that thread? I mean
   the case where the main thread doesn't handle the AR error but another
   thread does, and the SIGBUS can't be handled at once.

2) Why wasn't that patch merged? From that thread, Naoya should mean
   "acknowledge" :-).
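To make the difference between the old and the new test concrete, here is a small userspace model of the condition in kill_proc(). It is only a sketch: struct model_mm, struct model_task, MODEL_MF_ACTION_REQUIRED and the two helper functions are hypothetical names invented for illustration, not kernel code. Only the two conditions themselves are taken from the quoted diff.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's mm_struct and task_struct. */
struct model_mm {
	int id;
};

struct model_task {
	struct model_mm *mm;	/* threads of one process share this pointer */
};

/* Illustrative flag value only; not the kernel's definition. */
#define MODEL_MF_ACTION_REQUIRED 0x1

/*
 * Old test from the diff: choose BUS_MCEERR_AR only when the task found
 * by collect_procs() is itself the thread that hit the error.
 */
static bool sends_ar_old(int flags, const struct model_task *t,
			 const struct model_task *cur)
{
	return (flags & MODEL_MF_ACTION_REQUIRED) && t == cur;
}

/*
 * New test from the diff: choose BUS_MCEERR_AR whenever the task found
 * shares its mm with the thread that hit the error.
 */
static bool sends_ar_new(int flags, const struct model_task *t,
			 const struct model_task *cur)
{
	return (flags & MODEL_MF_ACTION_REQUIRED) && t->mm == cur->mm;
}
```

With a primary thread and a worker thread that share one mm, and the worker as "current", sends_ar_old() is false while sends_ar_new() is true. In the patched kernel code the signal is then forced on current rather than on t, which this model does not cover.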