On Wed, 29 Jul 2020, Zhang, Qiang wrote: > > From: Zhang Qiang <qiang.zhang@xxxxxxxxxxxxx> > > > > We should add node spinlock protect "n->alien" which may be > > assigned to NULL in cpuup_canceled func. cause address access > > exception. > > > > >Hi, do you have an example NULL pointer dereference where you have hit > >this? > If you have a NULL pointer dereference or a GPF that occurred because of this, it would be helpful to provide as rationale. > >This rather looks like something to fix up in cpuup_canceled() since it's > >currently manipulating the alien cache for the canceled cpu's node. > > yes , it is fix up in cpuup_canceled it's > currently manipulating the alien cache for the canceled cpu's node which may be the same as the node being operated on in the __cache_free_alien func. > > void cpuup_canceled > { > n = get_node(cachep, node); > spin_lock_irq(&n->list_lock); > ... > n->alien = NULL; > spin_unlock_irq(&n->list_lock); > .... > } > Right, so the idea is that this should be fixed in cpuup_canceled() instead -- why would we invaliate the entire node's alien cache because a single cpu failed to come online?