On Fri, Jun 21, 2024 at 10:02:50PM +0800, Baoquan He wrote:
> On 06/21/24 at 11:44am, Uladzislau Rezki wrote:
> > On Fri, Jun 21, 2024 at 03:07:16PM +0800, Baoquan He wrote:
> > > On 06/21/24 at 11:30am, Hailong Liu wrote:
> > > > On Thu, 20. Jun 14:02, Nick Bowler wrote:
> > > > > On 2024-06-20 02:19, Nick Bowler wrote:
> ......
> > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > > index be2dd281ea76..18e87cafbaf2 100644
> > > --- a/mm/vmalloc.c
> > > +++ b/mm/vmalloc.c
> > > @@ -2542,7 +2542,7 @@ static DEFINE_PER_CPU(struct vmap_block_queue, vmap_block_queue);
> > >  static struct xarray *
> > >  addr_to_vb_xa(unsigned long addr)
> > >  {
> > > -	int index = (addr / VMAP_BLOCK_SIZE) % num_possible_cpus();
> > > +	int index = (addr / VMAP_BLOCK_SIZE) % nr_cpu_ids;
> > >
> > >  	return &per_cpu(vmap_block_queue, index).vmap_blocks;
> > >  }
> >
> > The problem i see is about not-initializing of the:
> >
> > <snip>
> > 	for_each_possible_cpu(i) {
> > 		struct vmap_block_queue *vbq;
> > 		struct vfree_deferred *p;
> >
> > 		vbq = &per_cpu(vmap_block_queue, i);
> > 		spin_lock_init(&vbq->lock);
> > 		INIT_LIST_HEAD(&vbq->free);
> > 		p = &per_cpu(vfree_deferred, i);
> > 		init_llist_head(&p->list);
> > 		INIT_WORK(&p->wq, delayed_vfree_work);
> > 		xa_init(&vbq->vmap_blocks);
> > 	}
> > <snip>
> >
> > correctly or fully. It is my bad, i did not think that CPUs in a possible
> > mask can be non-sequential :-/
> >
> > nr_cpu_ids - is not the max possible CPU. For example, in Nick case,
> > when he has two CPUs, num_possible_cpus() and nr_cpu_ids are the same.
>
> I checked the generic version of setup_nr_cpu_ids(); from the code, it
> behaves differently from my understanding:
>
> kernel/smp.c
> void __init setup_nr_cpu_ids(void)
> {
> 	set_nr_cpu_ids(find_last_bit(cpumask_bits(cpu_possible_mask), NR_CPUS) + 1);
> }
>
I see that setup_nr_cpu_ids() is not a weak function, so it is generic and
its behavior cannot be overridden, which is great. It does exactly what we
need. Thank you for checking this, you are right!
Then it is just a matter of proper initialization of the hash:

<snip>
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 5d3aa2dc88a8..1733946f7a12 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -5087,7 +5087,13 @@ void __init vmalloc_init(void)
 	 */
 	vmap_area_cachep = KMEM_CACHE(vmap_area, SLAB_PANIC);

-	for_each_possible_cpu(i) {
+	/*
+	 * We use "nr_cpu_ids" here because some architectures
+	 * may have "gaps" in the cpu-possible-mask. That is OK
+	 * for per-cpu usage, but not when the index is also
+	 * used as a hash.
+	 */
+	for (i = 0; i < nr_cpu_ids; i++) {
 		struct vmap_block_queue *vbq;
 		struct vfree_deferred *p;
<snip>

--
Uladzislau Rezki