Re: [PATCH v2] kernel/fork: fix CLONE_CHILD_CLEARTID regression in nscd

On Wed 24-08-16 17:37:16, Michal Hocko wrote:
> On Wed 24-08-16 17:32:00, Oleg Nesterov wrote:
> > On 08/24, Michal Hocko wrote:
> > >
> > > Sounds better?
> > > diff --git a/kernel/fork.c b/kernel/fork.c
> > > index b89f0eb99f0a..ddde5849df81 100644
> > > --- a/kernel/fork.c
> > > +++ b/kernel/fork.c
> > > @@ -914,7 +914,8 @@ void mm_release(struct task_struct *tsk, struct mm_struct *mm)
> > >  
> > >  	/*
> > >  	 * Signal userspace if we're not exiting with a core dump
> > > -	 * or a killed vfork parent which shouldn't touch this mm.
> > > +	 * because we want to leave the value intact for debugging
> > > +	 * purposes.
> > >  	 */
> > >  	if (tsk->clear_child_tid) {
> > >  		if (!(tsk->signal->flags & SIGNAL_GROUP_COREDUMP) &&
> > 
> > Yes, thanks Michal!
> > 
> > Acked-by: Oleg Nesterov <oleg@xxxxxxxxxx>
> 
> OK, thanks.

ping

> ---
> From 39cad7842660e0261c27f75702d49458a1f3cea1 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@xxxxxxxx>
> Date: Mon, 30 May 2016 20:20:32 +0200
> Subject: [PATCH] kernel/fork: fix CLONE_CHILD_CLEARTID regression in nscd
> 
> fec1d0115240 ("[PATCH] Disable CLONE_CHILD_CLEARTID for abnormal exit")
> has caused a subtle regression in nscd which uses CLONE_CHILD_CLEARTID
> to clear the nscd_certainly_running flag in the shared databases, so
> that the clients are notified when nscd is restarted.  Now, when nscd
> uses a non-persistent database, clients that have it mapped keep
> thinking the database is being updated by nscd, when in fact nscd has
> created a new (anonymous) one (for non-persistent databases it uses an
> unlinked file as backend).
> 
> The original proposal for the CLONE_CHILD_CLEARTID change claimed
> (https://lkml.org/lkml/2006/10/25/233):
> "
> The NPTL library uses the CLONE_CHILD_CLEARTID flag on clone() syscalls
> on behalf of pthread_create() library calls.  This feature is used to
> request that the kernel clear the thread-id in user space (at an address
> provided in the syscall) when the thread disassociates itself from the
> address space, which is done in mm_release().
> 
> Unfortunately, when a multi-threaded process incurs a core dump (such as
> from a SIGSEGV), the core-dumping thread sends SIGKILL signals to all of
> the other threads, which then proceed to clear their user-space tids
> before synchronizing in exit_mm() with the start of core dumping.  This
> misrepresents the state of process's address space at the time of the
> SIGSEGV and makes it more difficult for someone to debug NPTL and glibc
> problems (misleading him/her to conclude that the threads had gone away
> before the fault).
> 
> The fix below is to simply avoid the CLONE_CHILD_CLEARTID action if a
> core dump has been initiated.
> "
> 
> The resulting patch from Roland (https://lkml.org/lkml/2006/10/26/269)
> seems to have a larger scope than the original patch asked for. It seems
> that limiting the scope of the check to core dumping should work for
> the SIGSEGV issue described above.
> 
> [Changelog partly based on Andreas' description]
> Fixes: fec1d0115240 ("[PATCH] Disable CLONE_CHILD_CLEARTID for abnormal exit")
> Tested-by:  William Preston <wpreston@xxxxxxxx>
> Cc: Roland McGrath <roland@xxxxxxxxxxxxx>
> Cc: Andreas Schwab <schwab@xxxxxxxx>
> Acked-by: Oleg Nesterov <oleg@xxxxxxxxxx>
> Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
> ---
>  kernel/fork.c | 10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 52e725d4a866..ddde5849df81 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -913,14 +913,12 @@ void mm_release(struct task_struct *tsk, struct mm_struct *mm)
>  	deactivate_mm(tsk, mm);
>  
>  	/*
> -	 * If we're exiting normally, clear a user-space tid field if
> -	 * requested.  We leave this alone when dying by signal, to leave
> -	 * the value intact in a core dump, and to save the unnecessary
> -	 * trouble, say, a killed vfork parent shouldn't touch this mm.
> -	 * Userland only wants this done for a sys_exit.
> +	 * Signal userspace if we're not exiting with a core dump
> +	 * because we want to leave the value intact for debugging
> +	 * purposes.
>  	 */
>  	if (tsk->clear_child_tid) {
> -		if (!(tsk->flags & PF_SIGNALED) &&
> +		if (!(tsk->signal->flags & SIGNAL_GROUP_COREDUMP) &&
>  		    atomic_read(&mm->mm_users) > 1) {
>  			/*
>  			 * We don't check the error code - if userspace has
> -- 
> 2.8.1
> 
> -- 
> Michal Hocko
> SUSE Labs

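For anyone skimming the thread, here is a rough userspace model of the
condition change in mm_release(). It is only a sketch: struct exit_state
and both helper functions are hypothetical names made up for illustration,
and the booleans merely stand in for the kernel fields named in the
comments. It is not the kernel code itself.

```c
#include <stdbool.h>

/*
 * Hypothetical, simplified stand-in for the state mm_release() looks at.
 * None of these names exist in the kernel; each field mirrors the
 * expression given in its comment.
 */
struct exit_state {
	bool has_clear_child_tid;	/* tsk->clear_child_tid != NULL */
	bool killed_by_signal;		/* tsk->flags & PF_SIGNALED */
	bool group_coredump;		/* tsk->signal->flags & SIGNAL_GROUP_COREDUMP */
	int mm_users;			/* atomic_read(&mm->mm_users) */
};

/* Check as introduced by fec1d0115240: any death by signal skips the clear. */
bool clears_tid_before(const struct exit_state *s)
{
	return s->has_clear_child_tid && !s->killed_by_signal &&
	       s->mm_users > 1;
}

/* Check with the patch above: only a core dump in progress skips the clear. */
bool clears_tid_after(const struct exit_state *s)
{
	return s->has_clear_child_tid && !s->group_coredump &&
	       s->mm_users > 1;
}
```

In this model the nscd case is a task killed by a signal without a core
dump while clients still share the mapping: clears_tid_before() is false
for it and clears_tid_after() is true. A task exiting as part of a core
dump is skipped by both checks.
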
-- 
Michal Hocko
SUSE Labs
