On Wed 24-08-16 17:37:16, Michal Hocko wrote:
> On Wed 24-08-16 17:32:00, Oleg Nesterov wrote:
> > On 08/24, Michal Hocko wrote:
> > >
> > > Sounds better?
> > > diff --git a/kernel/fork.c b/kernel/fork.c
> > > index b89f0eb99f0a..ddde5849df81 100644
> > > --- a/kernel/fork.c
> > > +++ b/kernel/fork.c
> > > @@ -914,7 +914,8 @@ void mm_release(struct task_struct *tsk, struct mm_struct *mm)
> > >
> > >  	/*
> > >  	 * Signal userspace if we're not exiting with a core dump
> > > -	 * or a killed vfork parent which shouldn't touch this mm.
> > > +	 * because we want to leave the value intact for debugging
> > > +	 * purposes.
> > >  	 */
> > >  	if (tsk->clear_child_tid) {
> > >  		if (!(tsk->signal->flags & SIGNAL_GROUP_COREDUMP) &&
> >
> > Yes, thanks Michal!
> >
> > Acked-by: Oleg Nesterov <oleg@xxxxxxxxxx>
>
> OK, thanks.

ping

> ---
> From 39cad7842660e0261c27f75702d49458a1f3cea1 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@xxxxxxxx>
> Date: Mon, 30 May 2016 20:20:32 +0200
> Subject: [PATCH] kernel/fork: fix CLONE_CHILD_CLEARTID regression in nscd
>
> fec1d0115240 ("[PATCH] Disable CLONE_CHILD_CLEARTID for abnormal exit")
> has caused a subtle regression in nscd which uses CLONE_CHILD_CLEARTID
> to clear the nscd_certainly_running flag in the shared databases, so
> that the clients are notified when nscd is restarted. Now, when nscd
> uses a non-persistent database, clients that have it mapped keep
> thinking the database is being updated by nscd, when in fact nscd has
> created a new (anonymous) one (for non-persistent databases it uses an
> unlinked file as backend).
>
> The original proposal for the CLONE_CHILD_CLEARTID change claimed
> (https://lkml.org/lkml/2006/10/25/233):
> "
> The NPTL library uses the CLONE_CHILD_CLEARTID flag on clone() syscalls
> on behalf of pthread_create() library calls. This feature is used to
> request that the kernel clear the thread-id in user space (at an address
> provided in the syscall) when the thread disassociates itself from the
> address space, which is done in mm_release().
>
> Unfortunately, when a multi-threaded process incurs a core dump (such as
> from a SIGSEGV), the core-dumping thread sends SIGKILL signals to all of
> the other threads, which then proceed to clear their user-space tids
> before synchronizing in exit_mm() with the start of core dumping. This
> misrepresents the state of process's address space at the time of the
> SIGSEGV and makes it more difficult for someone to debug NPTL and glibc
> problems (misleading him/her to conclude that the threads had gone away
> before the fault).
>
> The fix below is to simply avoid the CLONE_CHILD_CLEARTID action if a
> core dump has been initiated.
> "
>
> The resulting patch from Roland (https://lkml.org/lkml/2006/10/26/269)
> seems to have a larger scope than the original patch asked for. It seems
> that limiting the scope of the check to core dumping should work for
> the SIGSEGV issue described above.
>
> [Changelog partly based on Andreas' description]
> Fixes: fec1d0115240 ("[PATCH] Disable CLONE_CHILD_CLEARTID for abnormal exit")
> Tested-by: William Preston <wpreston@xxxxxxxx>
> Cc: Roland McGrath <roland@xxxxxxxxxxxxx>
> Cc: Andreas Schwab <schwab@xxxxxxxx>
> Acked-by: Oleg Nesterov <oleg@xxxxxxxxxx>
> Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
> ---
>  kernel/fork.c | 10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 52e725d4a866..ddde5849df81 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -913,14 +913,12 @@ void mm_release(struct task_struct *tsk, struct mm_struct *mm)
>  	deactivate_mm(tsk, mm);
>
>  	/*
> -	 * If we're exiting normally, clear a user-space tid field if
> -	 * requested. We leave this alone when dying by signal, to leave
> -	 * the value intact in a core dump, and to save the unnecessary
> -	 * trouble, say, a killed vfork parent shouldn't touch this mm.
> -	 * Userland only wants this done for a sys_exit.
> +	 * Signal userspace if we're not exiting with a core dump
> +	 * because we want to leave the value intact for debugging
> +	 * purposes.
>  	 */
>  	if (tsk->clear_child_tid) {
> -		if (!(tsk->flags & PF_SIGNALED) &&
> +		if (!(tsk->signal->flags & SIGNAL_GROUP_COREDUMP) &&
>  		    atomic_read(&mm->mm_users) > 1) {
>  			/*
>  			 * We don't check the error code - if userspace has
> --
> 2.8.1
>
> --
> Michal Hocko
> SUSE Labs

--
Michal Hocko
SUSE Labs
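
For readers following the thread, the behavioural difference between the
old and the new check in mm_release() can be sketched in plain userspace C.
This is only an illustrative model, not kernel code: struct demo_task, the
DEMO_* flag values and the helper names are all made up for the example, and
the mm_users value of 2 is an assumption standing in for "the mm still has
other users".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT the kernel's definitions or values. */
#define DEMO_PF_SIGNALED           0x1u
#define DEMO_SIGNAL_GROUP_COREDUMP 0x2u

struct demo_task {
	unsigned int flags;        /* models tsk->flags */
	unsigned int signal_flags; /* models tsk->signal->flags */
	int mm_users;              /* models atomic_read(&mm->mm_users) */
	bool has_clear_child_tid;  /* models tsk->clear_child_tid != NULL */
};

/* Old check: skip the tid clear whenever the task was killed by a signal. */
static bool old_should_clear(const struct demo_task *t)
{
	return t->has_clear_child_tid &&
	       !(t->flags & DEMO_PF_SIGNALED) &&
	       t->mm_users > 1;
}

/* New check: skip the tid clear only while a core dump is in progress. */
static bool new_should_clear(const struct demo_task *t)
{
	return t->has_clear_child_tid &&
	       !(t->signal_flags & DEMO_SIGNAL_GROUP_COREDUMP) &&
	       t->mm_users > 1;
}

/* Walks through the two cases the changelog distinguishes. */
static void demo_selfcheck(void)
{
	/* Killed by a signal, no core dump, mm still shared (assumed). */
	struct demo_task killed = {
		.flags = DEMO_PF_SIGNALED,
		.signal_flags = 0,
		.mm_users = 2,
		.has_clear_child_tid = true,
	};
	/* Same task, but a core dump has been initiated. */
	struct demo_task dumping = killed;
	dumping.signal_flags = DEMO_SIGNAL_GROUP_COREDUMP;

	assert(!old_should_clear(&killed));  /* old: tid left untouched */
	assert(new_should_clear(&killed));   /* new: tid is cleared */
	assert(!old_should_clear(&dumping)); /* both keep the tid ... */
	assert(!new_should_clear(&dumping)); /* ... for the core dump */
}
```

Calling demo_selfcheck() from a main() exercises both cases: only the
"killed without a core dump" task behaves differently under the new check,
which is the narrowing of scope the changelog argues for.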