On 7/26/21 7:00 PM, Mike Galbraith wrote:
> On Sun, 2021-07-25 at 21:34 +0200, Vlastimil Babka wrote:
>> On 7/25/21 9:12 PM, Vlastimil Babka wrote:
>>> +	/*
>>> +	 * On !RT we just want to disable preemption, on RT we need the lock
>>> +	 * for real. This happens to match local_lock() semantics.
>>> +	 */
>>> +	local_lock(&s->cpu_slab->lock);
>>
>> OK I realized (and tglx confirmed) that this will blow up on !RT +
>> lockdep if interrupted by ___slab_alloc() that will do
>> local_lock_irqsave(). So back to #ifdefs it is. But should work as-is
>> for RT testing.
>
> Speaking of that local_lock_irqsave(), and some unloved ifdefs..
>
> Why not do something like the below? When I look at new_slab:, I see
> cpu_slab->partial assignment protected by IRQs being disabled, which
> implies to me it should probably be so protected everywhere. There
> used to be another slub_set_percpu_partial() call in
> unfreeze_partials(), which was indeed called with IRQs disabled, quite
> sane looking to an mm outsider looking in. The odd man out ->partial
> assignment was the preempt disabled put_cpu_partial() cmpxchg loop,
> which contained an IRQ disabled region to accommodate the
> aforementioned unfreeze_partials().
>
> Is there real world benefit to the cmpxchg loops whacked below (ala
> monkey see monkey do) over everyone just taking the straight shot you
> laid down for RT? It's easier on the eye (mine anyway), and neither
> PREEMPT nor PREEMPT_RT seem inclined to complain... tick... tock...

Yep, sounds like a good approach, thanks. Percpu partial is not *the*
SLUB fast path, so it should be sufficient without the lockless cmpxchg
tricks. Will incorporate in updated series.

> ---
>  mm/slub.c |   79 ++++++++++++++++++++++++++++++++------------------------------
>  1 file changed, 41 insertions(+), 38 deletions(-)
>
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2437,13 +2437,12 @@ static void __unfreeze_partials(struct k
>  static void unfreeze_partials(struct kmem_cache *s)
>  {
>  	struct page *partial_page;
> +	unsigned long flags;
>
> -	do {
> -		partial_page = this_cpu_read(s->cpu_slab->partial);
> -
> -	} while (partial_page &&
> -		 this_cpu_cmpxchg(s->cpu_slab->partial, partial_page, NULL)
> -		 != partial_page);
> +	local_lock_irqsave(&s->cpu_slab->lock, flags);
> +	partial_page = this_cpu_read(s->cpu_slab->partial);
> +	this_cpu_write(s->cpu_slab->partial, NULL);
> +	local_unlock_irqrestore(&s->cpu_slab->lock, flags);
>
>  	if (partial_page)
>  		__unfreeze_partials(s, partial_page);
> @@ -2480,41 +2479,45 @@ static void put_cpu_partial(struct kmem_
>  {
>  #ifdef CONFIG_SLUB_CPU_PARTIAL
>  	struct page *oldpage;
> -	int pages;
> -	int pobjects;
> -
> -	slub_get_cpu_ptr(s->cpu_slab);
> -	do {
> -		pages = 0;
> -		pobjects = 0;
> -		oldpage = this_cpu_read(s->cpu_slab->partial);
> -
> -		if (oldpage) {
> -			pobjects = oldpage->pobjects;
> -			pages = oldpage->pages;
> -			if (drain && pobjects > slub_cpu_partial(s)) {
> -				/*
> -				 * partial array is full. Move the existing
> -				 * set to the per node partial list.
> -				 */
> -				unfreeze_partials(s);
> -				oldpage = NULL;
> -				pobjects = 0;
> -				pages = 0;
> -				stat(s, CPU_PARTIAL_DRAIN);
> -			}
> +	struct page *page_to_unfreeze = NULL;
> +	unsigned long flags;
> +	int pages = 0, pobjects = 0;
> +
> +	local_lock_irqsave(&s->cpu_slab->lock, flags);
> +
> +	if ((oldpage = this_cpu_read(s->cpu_slab->partial))) {
> +		pobjects = oldpage->pobjects;
> +		pages = oldpage->pages;
> +		if (drain && pobjects > slub_cpu_partial(s)) {
> +			/*
> +			 * partial array is full. Move the existing
> +			 * set to the per node partial list.
> +			 *
> +			 * Postpone unfreezing until we drop the local
> +			 * lock to avoid an RT unlock/relock requirement
> +			 * due to MEMCG __slab_free() recursion.
> +			 */
> +			page_to_unfreeze = oldpage;
> +
> +			oldpage = NULL;
> +			pobjects = 0;
> +			pages = 0;
> +			stat(s, CPU_PARTIAL_DRAIN);
>  		}
> +	}
> +
> +	pages++;
> +	pobjects += page->objects - page->inuse;
> +
> +	page->pages = pages;
> +	page->pobjects = pobjects;
> +	page->next = oldpage;
>
> -		pages++;
> -		pobjects += page->objects - page->inuse;
> +	this_cpu_write(s->cpu_slab->partial, page);
> +	local_unlock_irqrestore(&s->cpu_slab->lock, flags);
>
> -		page->pages = pages;
> -		page->pobjects = pobjects;
> -		page->next = oldpage;
> -
> -	} while (this_cpu_cmpxchg(s->cpu_slab->partial, oldpage, page)
> -		 != oldpage);
> -	slub_put_cpu_ptr(s->cpu_slab);
> +	if (page_to_unfreeze)
> +		__unfreeze_partials(s, page_to_unfreeze);
>  #endif /* CONFIG_SLUB_CPU_PARTIAL */
>  }
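
For anyone following along from outside mm/: the change above replaces a
lockless this_cpu_cmpxchg() retry loop with a plain read/write done under
local_lock_irqsave(). A minimal userspace sketch of the two patterns, with
C11 atomics standing in for the percpu cmpxchg and a pthread mutex standing
in for the local lock (both stand-ins are illustrative only, not the kernel
primitives):

/*
 * Userspace sketch of the two list-push patterns discussed above.
 * NOT kernel code: atomic_compare_exchange_weak() stands in for
 * this_cpu_cmpxchg(), and a pthread mutex stands in for
 * local_lock_irqsave(). Build with: cc -pthread sketch.c
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

struct page { struct page *next; };

static _Atomic(struct page *) partial_head;		/* ~ cpu_slab->partial */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER; /* ~ cpu_slab->lock */

/* Old style: lockless retry loop, like the whacked cmpxchg loops. */
static void push_cmpxchg(struct page *page)
{
	struct page *old;

	do {
		/* Rebuild state on every retry attempt. */
		old = atomic_load(&partial_head);
		page->next = old;
	} while (!atomic_compare_exchange_weak(&partial_head, &old, page));
}

/* New style: a straight shot under the lock, like the patch above. */
static void push_locked(struct page *page)
{
	pthread_mutex_lock(&lock);
	page->next = atomic_load(&partial_head);
	atomic_store(&partial_head, page);
	pthread_mutex_unlock(&lock);
}

int main(void)
{
	struct page a = { NULL }, b = { NULL };

	push_cmpxchg(&a);
	push_locked(&b);
	printf("head is %s\n", atomic_load(&partial_head) == &b ? "b" : "a");
	return 0;
}

The cmpxchg flavor never blocks, but it can spin under contention and has to
rebuild its state on every retry; the locked flavor is simpler to reason
about. In the kernel, local_lock_irqsave() disables interrupts on !PREEMPT_RT
and takes a real per-CPU lock on PREEMPT_RT, which is why the same code
serves both configs. Since the percpu partial list is not the allocation
fast path, the simpler locked form costs little.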