Re: [PATCH v8 00/70] Introducing the Maple Tree

* Qian Cai <quic_qiancai@xxxxxxxxxxx> [220427 12:10]:
> On Tue, Apr 26, 2022 at 03:06:19PM +0000, Liam Howlett wrote:
> > Andrew,
> > 
> > Please replace the patches in your mglru-maple branch with this set.  It should
> > be a drop in replacement for my patch range with the fixes into these
> > patches.  Adding the preallocation to work around the fs-reclaim LOCKDEP
> > issue caused enough changes to the patches to warrant a respin.
> > 
> > The last patch on the branch is still needed to fix vmscan after mglru
> > is applied.  ee4b1fc24f30 "mm/vmscan: Use VMA_ITERATOR in
> > get_next_vma()"
> > 
> > 
> > Here is the pretty cover letter you requested last time.
> > 
> > ------------------------------------
> > 
> > The maple tree is an RCU-safe range based B-tree designed to use modern
> > processor cache efficiently.  There are a number of places in the kernel
> > that a non-overlapping range-based tree would be beneficial, especially
> > one with a simple interface.  The first user that is covered in this
> > patch set is the vm_area_struct, where three data structures are
> > replaced by the maple tree: the augmented rbtree, the vma cache, and the
> > linked list of VMAs in the mm_struct.  The long term goal is to reduce
> > or remove the mmap_sem contention.
> > 
> > The tree has a branching factor of 10 for non-leaf nodes and 16 for leaf
> > nodes.  With the increased branching factor, it is significantly shorter than
> > the rbtree so it has fewer cache misses.  The removal of the linked list
> > between subsequent entries also reduces the cache misses and the need to pull
> > in the previous and next VMA during many tree alterations.
> > 
> > This patch set is based on v5.18-rc2
> > 
> > git: https://github.com/oracle/linux-uek/tree/howlett/maple/20220426
> > 
> > v8 changes:
> >  - Added preallocations before any potential edits to the tree when holding the
> > i_mmap_lock to avoid fs-reclaim issues on extreme memory pressure.
> >  - Fixed issue in mempolicy mas_for_each() loop.
> >  - Moved static definitions inside ifdef for DEBUG_MAPLE
> >  - Fixed compile warnings reported by build bots
> >  - Moved mas_dfs_preorder() to testing code
> >  - Changed __vma_adjust() to record the highest vma in case 6 instead of
> > finding it twice.
> >  - Fixed locking issue in exit_mmap()
> >  - Fixed up from/s-o-b ordering
> 
> Running some syscall fuzzer would trigger a crash.
> 
>  BUG: KASAN: use-after-free in mas_find
>  ma_dead_node at lib/maple_tree.c:532
>  (inlined by) mas_next_entry at lib/maple_tree.c:4637
>  (inlined by) mas_find at lib/maple_tree.c:5869
>  Read of size 8 at addr ffff88811c5e9c00 by task trinity-c0/1351
> 
>  CPU: 5 PID: 1351 Comm: trinity-c0 Not tainted 5.18.0-rc4-next-20220427 #3
>  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-5.fc35 04/01/2014
>  Call Trace:
>   <TASK>
>   dump_stack_lvl
>   print_address_description.constprop.0.cold
>   print_report.cold
>   kasan_report
>   mas_find
>   apply_mlockall_flags


Thanks.  This is indeed an issue with 0d43186b36c1 ("mm/mlock: use vma
iterator and instead of vma linked list").

Andrew, please include this patch as a fix.

Thanks,
Liam

From 62c50b9683d10ccaa0b689459efaa41794db129b Mon Sep 17 00:00:00 2001
From: "Liam R. Howlett" <Liam.Howlett@xxxxxxxxxx>
Date: Wed, 27 Apr 2022 12:46:04 -0400
Subject: [PATCH] mm/mlock: Use maple state in apply_mlockall_flags()

The vma iterator is for simple cases.  Since mlock_fixup() can cause the
tree to change and thus requires the maple state to be reset,
apply_mlockall_flags() is not a simple case.  Use a maple state and
call mas_pause() instead.

Fixes: 0d43186b36c1 ("mm/mlock: use vma iterator and instead of vma linked list")
Signed-off-by: Liam R. Howlett <Liam.Howlett@xxxxxxxxxx>
---
 mm/mlock.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/mlock.c b/mm/mlock.c
index d8549b3dcb59..c41604ba5197 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -660,7 +660,7 @@ SYSCALL_DEFINE2(munlock, unsigned long, start, size_t, len)
  */
 static int apply_mlockall_flags(int flags)
 {
-	VMA_ITERATOR(vmi, current->mm, 0);
+	MA_STATE(mas, &current->mm->mm_mt, 0, 0);
 	struct vm_area_struct *vma, *prev = NULL;
 	vm_flags_t to_add = 0;
 
@@ -681,7 +681,7 @@ static int apply_mlockall_flags(int flags)
 			to_add |= VM_LOCKONFAULT;
 	}
 
-	for_each_vma(vmi, vma) {
+	mas_for_each(&mas, vma, ULONG_MAX) {
 		vm_flags_t newflags;
 
 		newflags = vma->vm_flags & VM_LOCKED_CLEAR_MASK;
@@ -689,6 +689,7 @@ static int apply_mlockall_flags(int flags)
 
 		/* Ignore errors */
 		mlock_fixup(vma, &prev, vma->vm_start, vma->vm_end, newflags);
+		mas_pause(&mas);
 		cond_resched();
 	}
 out:
-- 
2.35.1
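
The reasoning in the commit message (a cached iterator position goes stale
once the loop body changes the tree, so the state has to be reset before the
next step) can be illustrated outside the kernel.  What follows is a minimal
userspace sketch under stated assumptions, not the maple tree implementation:
every toy_* name is hypothetical, the "tree" is a single heap-allocated array
of sorted ranges, and toy_split() stands in for whatever tree change
mlock_fixup() may cause.  toy_pause() plays the role that mas_pause() plays in
the patch: it drops the cached node so the next step re-walks from the last
visited index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a VMA: a half-open address range [start, end). */
struct range { unsigned long start, end; };

/* Toy "tree": one heap node holding sorted, non-overlapping ranges. */
struct toy_tree { struct range *node; size_t count; };

/* Toy cursor (hypothetical): caches a node pointer, as an iterator might. */
struct toy_cursor {
	struct toy_tree *tree;
	struct range *node;	/* cached node; NULL means "re-walk from root" */
	size_t slot;		/* next slot to read in the cached node */
	unsigned long index;	/* lowest address not yet visited */
};

/* Copy the next range at or above c->index into *out; return 0 when done. */
static int toy_next(struct toy_cursor *c, struct range *out)
{
	assert(c->tree);
	if (!c->node) {
		/* No cached position: search again from the start. */
		c->node = c->tree->node;
		c->slot = 0;
		while (c->slot < c->tree->count &&
		       c->node[c->slot].end <= c->index)
			c->slot++;
	}
	if (c->slot >= c->tree->count)
		return 0;
	*out = c->node[c->slot++];
	c->index = out->end;
	return 1;
}

/* Forget the cached node; the last visited index is kept. */
static void toy_pause(struct toy_cursor *c)
{
	c->node = NULL;
}

/* Split the range containing addr.  The old node is freed, so any cursor
 * still pointing at it would be reading freed memory. */
static int toy_split(struct toy_tree *t, unsigned long addr)
{
	struct range *n = malloc((t->count + 1) * sizeof(*n));
	size_t i, j = 0;

	if (!n)
		return -1;
	for (i = 0; i < t->count; i++) {
		struct range r = t->node[i];

		if (r.start < addr && addr < r.end) {
			n[j++] = (struct range){ r.start, addr };
			n[j++] = (struct range){ addr, r.end };
		} else {
			n[j++] = r;
		}
	}
	free(t->node);
	t->node = n;
	t->count = j;
	return 0;
}

/* Visit each original range once, splitting it at its midpoint. */
static size_t toy_walk_and_split(struct toy_tree *t)
{
	struct toy_cursor c = { .tree = t, .node = NULL, .slot = 0, .index = 0 };
	struct range r;
	size_t visited = 0;

	while (toy_next(&c, &r)) {
		visited++;
		toy_split(t, r.start + (r.end - r.start) / 2); /* errors ignored */
		toy_pause(&c);	/* without this, c.node points at freed memory */
	}
	return visited;
}
```

In this sketch, removing the toy_pause() call leaves the cursor reading from
the node that toy_split() just freed, which is the same class of
use-after-free that KASAN reported in mas_find() above.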

