Re: [PATCH v2] kernel: Be more careful about dup_mmap() failures and uprobe registering

* Matthew Wilcox <willy@xxxxxxxxxxxxx> [250127 12:08]:
> On Mon, Jan 27, 2025 at 12:02:21PM -0500, Liam R. Howlett wrote:
> > From: "Liam R. Howlett" <Liam.Howlett@xxxxxxxxxx>
> > 
> > In the even that there is a failure during dup_mmap(), the maple tree
> 
> "event".  Although the writing is a little clumsy here.  You could just
> say "If there is a failure".

Thanks.


> But that begs the question of what kind of
> failure, and I think it's only a memory allocation failure?  So you
> could say
> 
> "If a memory allocation fails during dup_mmap(),"

The problem that syzbot found is an allocation failure, but I didn't
want to say that's the only possible failure that could cause the
duplication to fail.  Looking at the code, we could also hit this issue
with a fatal signal.

With this patch, any non-zero return from dup_mmap() marks the mm as
unstable (MMF_UNSTABLE).

If we fail inside the loop across the vmas, we also set MMF_OOM_SKIP,
since it may not be safe for the oom code to iterate the tree.
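
To make those two rules concrete, here is a rough userspace model of the
error branch and of the check added to the uprobe side.  It is only a sketch:
the struct, the bit positions, the dup_failure enum and the *_model helpers
are made up for illustration and are not kernel code; only the flag names and
the -EFAULT result of check_stable_address_space() mirror the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical bit positions; the kernel's real values differ. */
enum { MMF_OOM_SKIP = 0, MMF_UNSTABLE = 1 };

/* Stand-in for mm_struct: only the flags word matters here. */
struct mm_model {
	unsigned long flags;
};

/* Where the duplication failed, if it failed at all (illustrative only). */
enum dup_failure {
	DUP_OK,           /* retval == 0 */
	DUP_FAIL_EARLY,   /* retval != 0 and mpnt == NULL */
	DUP_FAIL_IN_LOOP, /* retval != 0 while walking the vmas (mpnt != NULL) */
};

static void set_flag(struct mm_model *mm, int bit)
{
	mm->flags |= 1UL << bit;
}

static bool test_flag(const struct mm_model *mm, int bit)
{
	return (mm->flags & (1UL << bit)) != 0;
}

/* Models the tail of dup_mmap() after the patch. */
static int dup_mmap_model(struct mm_model *mm, enum dup_failure how)
{
	if (how == DUP_OK)
		return 0;
	if (how == DUP_FAIL_IN_LOOP)
		set_flag(mm, MMF_OOM_SKIP); /* tree holds a failure marker */
	set_flag(mm, MMF_UNSTABLE);         /* set on any non-zero return */
	return -ENOMEM; /* arbitrary error for the model; a signal would differ */
}

/* Models check_stable_address_space(): non-zero means "leave this mm alone". */
static int check_stable_model(const struct mm_model *mm)
{
	return test_flag(mm, MMF_UNSTABLE) ? -EFAULT : 0;
}

/* Models the uprobe registration path, which skips an unstable mm. */
static bool uprobe_would_touch(const struct mm_model *mm)
{
	return check_stable_model(mm) == 0;
}

/* Walks the three cases and asserts the flag state for each. */
static void model_demo(void)
{
	struct mm_model ok = { 0 }, early = { 0 }, loop = { 0 };

	assert(dup_mmap_model(&ok, DUP_OK) == 0);
	assert(uprobe_would_touch(&ok));

	assert(dup_mmap_model(&early, DUP_FAIL_EARLY) != 0);
	assert(test_flag(&early, MMF_UNSTABLE));
	assert(!test_flag(&early, MMF_OOM_SKIP));

	assert(dup_mmap_model(&loop, DUP_FAIL_IN_LOOP) != 0);
	assert(test_flag(&loop, MMF_UNSTABLE));
	assert(test_flag(&loop, MMF_OOM_SKIP));
	assert(!uprobe_would_touch(&loop));
}
```

Calling model_demo() from a main() runs through the three cases.  The point
is that MMF_UNSTABLE covers every failed duplication, while MMF_OOM_SKIP is
only needed once the tree may contain the failure marker.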


> 
> > can be left in an unsafe state for other iterators besides the exit
> > path.  All the locks are dropped before the exit_mmap() call (in
> > mm/mmap.c), but the incomplete mm_struct can be reached through (at
> > least) the rmap finding the vmas which have a pointer back to the
> > mm_struct.
> > 
> > Up to this point, there have been no issues with being able to find an
> > mm_struct that was only partially initialised.  Syzbot was able to make
> > the incomplete mm_struct fail with recent forking changes, so it has
> > been proven unsafe to use the mm_struct that hasn't been initialised, as
> > referenced in the link below.
> > 
> > Although 8ac662f5da19f ("fork: avoid inappropriate uprobe access to
> > invalid mm") fixed the uprobe access, it does not completely remove the
> > race.
> > 
> > This patch sets the MMF_OOM_SKIP to avoid the iteration of the vmas on
> > the oom side (even though this is extremely unlikely to be selected as
> > an oom victim in the race window), and sets MMF_UNSTABLE to avoid other
> > potential users from using a partially initialised mm_struct.
> > 
> > When registering vmas for uprobe, skip the vmas in an mm that is marked
> > unstable.  Modifying a vma in an unstable mm may cause issues if the mm
> > isn't fully initialised.
> > 
> > Link: https://lore.kernel.org/all/6756d273.050a0220.2477f.003d.GAE@xxxxxxxxxx/
> > Fixes: d240629148377 ("fork: use __mt_dup() to duplicate maple tree in dup_mmap()")
> > Cc: Oleg Nesterov <oleg@xxxxxxxxxx>
> > Cc: Masami Hiramatsu <mhiramat@xxxxxxxxxx>
> > Cc: Jann Horn <jannh@xxxxxxxxxx>
> > Cc: Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx>
> > Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > Cc: Michal Hocko <mhocko@xxxxxxxx>
> > Cc: Peng Zhang <zhangpeng.00@xxxxxxxxxxxxx>
> > Signed-off-by: Liam R. Howlett <Liam.Howlett@xxxxxxxxxx>
> > ---
> > 
> > v1: https://lore.kernel.org/all/20250123205849.793810-1-Liam.Howlett@xxxxxxxxxx/
> > 
> > Changes since:
> > v1
> >  - Added check_stable_address_space() to uprobe code - Thanks Lorenzo
> >  - Added Oleg & Masami to Cc list.
> > 
> >  kernel/events/uprobes.c |  4 ++++
> >  kernel/fork.c           | 17 ++++++++++++++---
> >  2 files changed, 18 insertions(+), 3 deletions(-)
> > 
> > diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> > index fa04b14a7d723..90ebcdbad05ca 100644
> > --- a/kernel/events/uprobes.c
> > +++ b/kernel/events/uprobes.c
> > @@ -28,6 +28,7 @@
> >  #include <linux/rcupdate_trace.h>
> >  #include <linux/workqueue.h>
> >  #include <linux/srcu.h>
> > +#include <linux/oom.h>          /* check_stable_address_space */
> >  
> >  #include <linux/uprobes.h>
> >  
> > @@ -1260,6 +1261,9 @@ register_for_each_vma(struct uprobe *uprobe, struct uprobe_consumer *new)
> >  		 * returns NULL in find_active_uprobe_rcu().
> >  		 */
> >  		mmap_write_lock(mm);
> > +		if (check_stable_address_space(mm))
> > +			goto unlock;
> > +
> >  		vma = find_vma(mm, info->vaddr);
> >  		if (!vma || !valid_vma(vma, is_register) ||
> >  		    file_inode(vma->vm_file) != uprobe->inode)
> > diff --git a/kernel/fork.c b/kernel/fork.c
> > index ded49f18cd95c..20b2120f019ca 100644
> > --- a/kernel/fork.c
> > +++ b/kernel/fork.c
> > @@ -760,7 +760,8 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
> >  		mt_set_in_rcu(vmi.mas.tree);
> >  		ksm_fork(mm, oldmm);
> >  		khugepaged_fork(mm, oldmm);
> > -	} else if (mpnt) {
> > +	} else {
> > +
> >  		/*
> >  		 * The entire maple tree has already been duplicated. If the
> >  		 * mmap duplication fails, mark the failure point with
> > @@ -768,8 +769,18 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
> >  		 * stop releasing VMAs that have not been duplicated after this
> >  		 * point.
> >  		 */
> > -		mas_set_range(&vmi.mas, mpnt->vm_start, mpnt->vm_end - 1);
> > -		mas_store(&vmi.mas, XA_ZERO_ENTRY);
> > +		if (mpnt) {
> > +			mas_set_range(&vmi.mas, mpnt->vm_start, mpnt->vm_end - 1);
> > +			mas_store(&vmi.mas, XA_ZERO_ENTRY);
> > +			/* Avoid OOM iterating a broken tree */
> > +			set_bit(MMF_OOM_SKIP, &mm->flags);
> > +		}
> > +		/*
> > +		 * The mm_struct is going to exit, but the locks will be dropped
> > +		 * first.  Set the mm_struct as unstable is advisable as it is
> > +		 * not fully initialised.
> > +		 */
> > +		set_bit(MMF_UNSTABLE, &mm->flags);
> >  	}
> >  out:
> >  	mmap_write_unlock(mm);
> > -- 
> > 2.43.0
> > 
> > 
> > -- 
> > maple-tree mailing list
> > maple-tree@xxxxxxxxxxxxxxxxxxx
> > https://lists.infradead.org/mailman/listinfo/maple-tree
> 



