On 7/21/21 11:33 AM, Mike Galbraith wrote:
> On Wed, 2021-07-21 at 10:44 +0200, Vlastimil Babka wrote:
>>
>> So this doesn't look like our put_cpu_partial() preempted a
>> __slab_alloc() on the same cpu, right?
>
> No, likely it was the one preempted by someone long gone, but we'll
> never know without setting a trap.
>
>> BTW did my ugly patch work?
>
> Nope. I guess you missed my reporting it to have been a -ENOBOOT, and

Indeed, I misunderstood that as you talking about your own patch.

> that cutting it in half, ie snagging only __slab_free() does boot, and
> seems to cure all of the RT fireworks.

OK, so depending on drain=1 makes this apply only to put_cpu_partial()
called from __slab_free() and not from get_partial_node(). One notable
difference is that in __slab_free() we don't hold n->list_lock, while in
get_partial_node() we do.

I guess if your list_lock is made raw again by another patch, that
explains why a local_lock can't nest under it. If not, then I would
expect this to work (I don't think they ever nest in the opposite order,
and lockdep should tell us instead of -ENOBOOT?), but I might be missing
something...

I'd rather not nest those locks in any case. I just need to convince
myself that the scenario the half-fix fixes is indeed the only one that
needs fixing, and that we're not leaving behind other races that are
just harder to trigger...

> (chainsaw noises...)
>
> ---
>  mm/slub.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
>
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2551,6 +2551,8 @@ static void put_cpu_partial(struct kmem_
>  	int pobjects;
>
>  	slub_get_cpu_ptr(s->cpu_slab);
> +	if (IS_ENABLED(CONFIG_PREEMPT_RT) && drain)
> +		local_lock(&s->cpu_slab->lock);
>  	do {
>  		pages = 0;
>  		pobjects = 0;
> @@ -2564,7 +2566,13 @@ static void put_cpu_partial(struct kmem_
>  			 * partial array is full. Move the existing
>  			 * set to the per node partial list.
>  			 */
> -			unfreeze_partials(s);
> +			this_cpu_write(s->cpu_slab->partial, NULL);
> +			if (IS_ENABLED(CONFIG_PREEMPT_RT) && drain)
> +				local_unlock(&s->cpu_slab->lock);
> +			__unfreeze_partials(s, oldpage);
> +			if (IS_ENABLED(CONFIG_PREEMPT_RT) && drain)
> +				local_lock(&s->cpu_slab->lock);
> +
>  			oldpage = NULL;
>  			pobjects = 0;
>  			pages = 0;
> @@ -2581,6 +2589,8 @@ static void put_cpu_partial(struct kmem_
>
>  	} while (this_cpu_cmpxchg(s->cpu_slab->partial, oldpage, page)
>  								!= oldpage);
> +	if (IS_ENABLED(CONFIG_PREEMPT_RT) && drain)
> +		local_unlock(&s->cpu_slab->lock);
>  	slub_put_cpu_ptr(s->cpu_slab);
>  #endif /* CONFIG_SLUB_CPU_PARTIAL */
>  }