Re: [PATCHv3] mm: fix incorrect vbq reference in purge_fragmented_block

Baoquan He <bhe@xxxxxxxxxx> · Sat, 1 Jun 2024 10:34:15 +0800

On 05/31/24 at 10:04am, Uladzislau Rezki wrote:
> On Fri, May 31, 2024 at 11:05:20AM +0800, zhaoyang.huang wrote:
> > From: Zhaoyang Huang <zhaoyang.huang@xxxxxxxxxx>
> > 
> > The vmalloc area runs out on our ARM64 system during an erofs test,
> > with vm_map_ram() failing[1]. Following the debug log, we find that
> > vm_map_ram()->vb_alloc() allocates a new vb->va, which corresponds
> > to a 4MB vmalloc area, because list_for_each_entry_rcu() returns
> > immediately when vbq->free->next points to vbq->free. That is to say,
> > 65536 page faults after the list is broken will exhaust the whole
> > vmalloc area. This appears to be caused by a vbq->free->next pointing
> > to vbq->free, which prevents list_for_each_entry_rcu() from iterating
> > the list and finding the BUG.
> > 
> > [1]
> > PID: 1        TASK: ffffff80802b4e00  CPU: 6    COMMAND: "init"
> >  #0 [ffffffc08006afe0] __switch_to at ffffffc08111d5cc
> >  #1 [ffffffc08006b040] __schedule at ffffffc08111dde0
> >  #2 [ffffffc08006b0a0] schedule at ffffffc08111e294
> >  #3 [ffffffc08006b0d0] schedule_preempt_disabled at ffffffc08111e3f0
> >  #4 [ffffffc08006b140] __mutex_lock at ffffffc08112068c
> >  #5 [ffffffc08006b180] __mutex_lock_slowpath at ffffffc08111f8f8
> >  #6 [ffffffc08006b1a0] mutex_lock at ffffffc08111f834
> >  #7 [ffffffc08006b1d0] reclaim_and_purge_vmap_areas at ffffffc0803ebc3c
> >  #8 [ffffffc08006b290] alloc_vmap_area at ffffffc0803e83fc
> >  #9 [ffffffc08006b300] vm_map_ram at ffffffc0803e78c0
> > 
> > Fixes: fc1e0d980037 ("mm/vmalloc: prevent stale TLBs in fully utilized blocks")
> > 
> > Suggested-by: Hailong.Liu <hailong.liu@xxxxxxxx>
> > Signed-off-by: Zhaoyang Huang <zhaoyang.huang@xxxxxxxxxx>
> >
> Is the problem related to running out of vmalloc space _only_, or is it a
> problem with a broken list? From the commit message it is hard to follow
> the reason.

This should fix the broken list.

Hi Zhaoyang and Hailong,

Could either of you test the patch below in your testing environment?

From b56dcc7d98c4dbb7ea197516bd129c30c0e9d1ef Mon Sep 17 00:00:00 2001
From: Baoquan He <bhe@xxxxxxxxxx>
Date: Fri, 31 May 2024 23:44:57 +0800
Subject: [PATCH] mm/vmalloc.c: add vb into appropriate vbq->free
Content-type: text/plain

The current vbq is organized as a per-cpu data structure, including an xa
and a list. However, adding a vb to the vbq->free list is not handled
correctly. A new vmap_block can be allocated on one cpu while it
actually belongs to another cpu's percpu vbq. Then
list_for_each_entry_rcu() on vbq->free and the vb's deletion could cause
list breakage.

Fix the vb adding so that the vb is added to the expected vbq->free.

Signed-off-by: Baoquan He <bhe@xxxxxxxxxx>
---
 mm/vmalloc.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index b921baf0ef8a..47659b41259a 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2547,6 +2547,14 @@ addr_to_vb_xa(unsigned long addr)
 	return &per_cpu(vmap_block_queue, index).vmap_blocks;
 }
 
+static struct vmap_block_queue *
+addr_to_vbq(unsigned long addr)
+{
+	int index = (addr / VMAP_BLOCK_SIZE) % num_possible_cpus();
+
+	return &per_cpu(vmap_block_queue, index);
+}
+
 /*
  * We should probably have a fallback mechanism to allocate virtual memory
  * out of partially filled vmap blocks. However vmap block sizing should be
@@ -2626,7 +2634,7 @@ static void *new_vmap_block(unsigned int order, gfp_t gfp_mask)
 		return ERR_PTR(err);
 	}
 
-	vbq = raw_cpu_ptr(&vmap_block_queue);
+	vbq = addr_to_vbq(va->va_start);
 	spin_lock(&vbq->lock);
 	list_add_tail_rcu(&vb->free_list, &vbq->free);
 	spin_unlock(&vbq->lock);
-- 
2.41.0
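
To make the mismatch described in the commit message easier to see, here is a
small userspace sketch (not kernel code). It assumes the same address hashing
as addr_to_vbq() in the patch above; SIM_NR_CPUS, SIM_BLOCK_SIZE and both
function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative constants; assumptions for this sketch, not kernel values. */
#define SIM_NR_CPUS     4UL
#define SIM_BLOCK_SIZE  (4UL << 20)	/* 4MB per vmap block, as in the report */

/* Same hashing as addr_to_vbq() in the patch: block number modulo cpu count. */
static unsigned long sim_addr_to_index(unsigned long addr)
{
	return (addr / SIM_BLOCK_SIZE) % SIM_NR_CPUS;
}

/*
 * Before the patch, the vb goes onto the free list of the cpu that
 * allocated it, while the lookup index is derived from the address.
 * Returns true when the two disagree for a given block.
 */
static bool sim_queue_mismatch(unsigned long va_start, unsigned long alloc_cpu)
{
	assert(alloc_cpu < SIM_NR_CPUS);
	return sim_addr_to_index(va_start) != alloc_cpu;
}
```

With these assumed values, a block starting at 5 * SIM_BLOCK_SIZE hashes to
index 1, so allocating it on cpu 0 would put it on a different queue than the
one its address maps to. Deriving the queue from va->va_start, as the patch
does, avoids that.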