On Sat, Jun 1, 2024 at 10:34 AM Baoquan He <bhe@xxxxxxxxxx> wrote:
>
> On 05/31/24 at 10:04am, Uladzislau Rezki wrote:
> > On Fri, May 31, 2024 at 11:05:20AM +0800, zhaoyang.huang wrote:
> > > From: Zhaoyang Huang <zhaoyang.huang@xxxxxxxxxx>
> > >
> > > vmalloc area runs out in our ARM64 system during an erofs test as
> > > vm_map_ram failed[1]. By following the debug log, we find that
> > > vm_map_ram()->vb_alloc() will allocate new vb->va which corresponding
> > > to 4MB vmalloc area as list_for_each_entry_rcu returns immediately
> > > when vbq->free->next points to vbq->free. That is to say, 65536 times
> > > of page fault after the list's broken will run out of the whole
> > > vmalloc area. This should be introduced by one vbq->free->next point to
> > > vbq->free which makes list_for_each_entry_rcu can not iterate the list
> > > and find the BUG.
> > >
> > > [1]
> > > PID: 1 TASK: ffffff80802b4e00 CPU: 6 COMMAND: "init"
> > > #0 [ffffffc08006afe0] __switch_to at ffffffc08111d5cc
> > > #1 [ffffffc08006b040] __schedule at ffffffc08111dde0
> > > #2 [ffffffc08006b0a0] schedule at ffffffc08111e294
> > > #3 [ffffffc08006b0d0] schedule_preempt_disabled at ffffffc08111e3f0
> > > #4 [ffffffc08006b140] __mutex_lock at ffffffc08112068c
> > > #5 [ffffffc08006b180] __mutex_lock_slowpath at ffffffc08111f8f8
> > > #6 [ffffffc08006b1a0] mutex_lock at ffffffc08111f834
> > > #7 [ffffffc08006b1d0] reclaim_and_purge_vmap_areas at ffffffc0803ebc3c
> > > #8 [ffffffc08006b290] alloc_vmap_area at ffffffc0803e83fc
> > > #9 [ffffffc08006b300] vm_map_ram at ffffffc0803e78c0
> > >
> > > Fixes: fc1e0d980037 ("mm/vmalloc: prevent stale TLBs in fully utilized blocks")
> > >
> > > Suggested-by: Hailong.Liu <hailong.liu@xxxxxxxx>
> > > Signed-off-by: Zhaoyang Huang <zhaoyang.huang@xxxxxxxxxx>
> > >
> > Is a problem related to run out of vmalloc space _only_ or it is a problem
> > with broken list? From the commit message it is hard to follow the reason.
>
> This should fix the broken list.
>
> Hi Zhaoyang and Hailong,
>
> Could any of you test below patch in your testing environment?
>
> From b56dcc7d98c4dbb7ea197516bd129c30c0e9d1ef Mon Sep 17 00:00:00 2001
> From: Baoquan He <bhe@xxxxxxxxxx>
> Date: Fri, 31 May 2024 23:44:57 +0800
> Subject: [PATCH] mm/vmalloc.c: add vb into appropriate vbq->free
> Content-type: text/plain
>
> The current vbq is organized into per-cpu data structure, including a xa
> and list. However, its adding into vba->free list is not handled
> correctly. The new vmap_block allocation could be done in one cpu, while
> it's actually belong into another cpu's percpu vbq. Then the
> list_for_each_entry_rcu() on the vbq->free and its deletion could cause
> list breakage.
>
> This fix the wrong vb adding to make it be added into expected
> vba->free.
>
> Signed-off-by: Baoquan He <bhe@xxxxxxxxxx>
> ---
>  mm/vmalloc.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index b921baf0ef8a..47659b41259a 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2547,6 +2547,14 @@ addr_to_vb_xa(unsigned long addr)
>  	return &per_cpu(vmap_block_queue, index).vmap_blocks;
>  }
>
> +static struct vmap_block_queue *
> +addr_to_vbq(unsigned long addr)
> +{
> +	int index = (addr / VMAP_BLOCK_SIZE) % num_possible_cpus();
> +
> +	return &per_cpu(vmap_block_queue, index);
> +}

Hmm, I am wondering whether it makes sense to add an address to vbq[CPU1]
from CPU0 (and so on), since that goes against the purpose of a per-CPU
variable?

> +
>  /*
>   * We should probably have a fallback mechanism to allocate virtual memory
>   * out of partially filled vmap blocks. However vmap block sizing should be
> @@ -2626,7 +2634,7 @@ static void *new_vmap_block(unsigned int order, gfp_t gfp_mask)
>  		return ERR_PTR(err);
>  	}
>
> -	vbq = raw_cpu_ptr(&vmap_block_queue);
> +	vbq = addr_to_vbq(va->va_start);
>  	spin_lock(&vbq->lock);
>  	list_add_tail_rcu(&vb->free_list, &vbq->free);
>  	spin_unlock(&vbq->lock);
> --
> 2.41.0
>