On 9/3/21 1:51 AM, Linus Torvalds wrote:
> [ Talking to myself while mulling this series over ... ]
>
> On Thu, Sep 2, 2021 at 4:34 PM Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>>
>> Instead of having it lock/unlock halfway through the function (and
>> have magic "Oh, the caller already holds the lock, so don't lock"
>> semantics except with misleading names) I really think that function
>> should just have been split in two, and then the locked region can be
>> minimized in the caller only taking it for the first part.

Normally I would have done that, similarly to "[patch 033/212] mm, slub:
separate detaching of partial list in unfreeze_partials() from
unfreezing".

> If there's some reason why it can't sanely be split into two (too many
> common variables or some odd control flow or whatever), at least the
> locking logic should be changed from

I think what discouraged me was that the second part calls
deactivate_slab(), which needs two values from the first part (page and
freelist), so one of them has to become a return parameter. On the other
hand, the second part does so little that it can be open-coded. See below.

>
> 	if (lock)
> 		local_irq_save(flags);
>
> to something along the lines of
>
> 	if (!caller_already_locked)
> 		local_irq_save(flags);
>
> so that when you read that function on its own, it's clear that the
> lock is always held over that critical section - and the issue is that
> perhaps the lock was already taken by the caller.

Actually that "already taken" becomes "caller does not need it/can't even
take the local lock as it's not local" (it's a cpu hot remove handler
acting on behalf of another, dead cpu).

So would it work with something like the following cleanup on top, applied
later after proper testing? (For now it's just compile tested.)
---8<---
diff --git a/mm/slub.c b/mm/slub.c
index df1ac8aff86f..0d9e63e918f1 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2566,38 +2566,33 @@ static inline void unfreeze_partials_cpu(struct kmem_cache *s,
 
 #endif	/* CONFIG_SLUB_CPU_PARTIAL */
 
-static inline void flush_slab(struct kmem_cache *s, struct kmem_cache_cpu *c,
-			      bool lock)
+static inline struct page *
+__detach_cpu_slab(struct kmem_cache *s, struct kmem_cache_cpu *c,
+		  void **freelist)
 {
-	unsigned long flags;
-	void *freelist;
 	struct page *page;
 
-	if (lock)
-		local_lock_irqsave(&s->cpu_slab->lock, flags);
-
-	freelist = c->freelist;
 	page = c->page;
+	*freelist = c->freelist;
 
 	c->page = NULL;
 	c->freelist = NULL;
 	c->tid = next_tid(c->tid);
 
-	if (lock)
-		local_unlock_irqrestore(&s->cpu_slab->lock, flags);
-
-	if (page)
-		deactivate_slab(s, page, freelist);
-
-	stat(s, CPUSLAB_FLUSH);
+	return page;
 }
 
 static inline void __flush_cpu_slab(struct kmem_cache *s, int cpu)
 {
 	struct kmem_cache_cpu *c = per_cpu_ptr(s->cpu_slab, cpu);
+	struct page *page;
+	void *freelist;
 
-	if (c->page)
-		flush_slab(s, c, false);
+	if (c->page) {
+		page = __detach_cpu_slab(s, c, &freelist);
+		deactivate_slab(s, page, freelist);
+		stat(s, CPUSLAB_FLUSH);
+	}
 
 	unfreeze_partials_cpu(s, c);
 }
@@ -2618,14 +2613,24 @@ static void flush_cpu_slab(struct work_struct *w)
 	struct kmem_cache *s;
 	struct kmem_cache_cpu *c;
 	struct slub_flush_work *sfw;
+	struct page *page;
+	void *freelist;
+	unsigned long flags;
 
 	sfw = container_of(w, struct slub_flush_work, work);
 	s = sfw->s;
 	c = this_cpu_ptr(s->cpu_slab);
 
-	if (c->page)
-		flush_slab(s, c, true);
+	if (c->page) {
+		local_lock_irqsave(&s->cpu_slab->lock, flags);
+		page = __detach_cpu_slab(s, c, &freelist);
+		local_unlock_irqrestore(&s->cpu_slab->lock, flags);
+
+		if (page)
+			deactivate_slab(s, page, freelist);
+		stat(s, CPUSLAB_FLUSH);
+	}
 
 	unfreeze_partials(s);
 }
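
For readers following the thread without the SLUB internals in their head,
here is a minimal userspace sketch of the pattern the patch follows: the
detach step hands back two values (one returned, one through an
out-parameter), and only the caller decides whether the lock is taken
around it, so the slow deactivate work stays outside the critical section.
This is illustration only, not kernel code; struct cache, detach_slab(),
deactivate(), flush_locked() and flush_unlocked() are made-up names, and a
pthread mutex stands in for the local lock.

	#include <pthread.h>
	#include <stdio.h>

	/* Stand-in for the per-cpu slab state plus its lock. */
	struct cache {
		void *page;
		void *freelist;
		pthread_mutex_t lock;
	};

	/*
	 * First half of the old flush: detach the cached state.
	 * Two values must come back (page and freelist), so one is the
	 * return value and the other an out-parameter.  No locking here;
	 * that is the caller's business.
	 */
	static void *detach_slab(struct cache *c, void **freelist)
	{
		void *page = c->page;

		*freelist = c->freelist;
		c->page = NULL;
		c->freelist = NULL;
		return page;
	}

	/* Stand-in for deactivate_slab(): slow work done outside the lock. */
	static void deactivate(void *page, void *freelist)
	{
		printf("deactivating page %p, freelist %p\n", page, freelist);
	}

	/* Caller that needs the lock: holds it only around the detach. */
	static void flush_locked(struct cache *c)
	{
		void *page, *freelist;

		pthread_mutex_lock(&c->lock);
		page = detach_slab(c, &freelist);
		pthread_mutex_unlock(&c->lock);

		if (page)
			deactivate(page, freelist);
	}

	/* Caller that must not take the lock (think: flushing on behalf of
	 * another, already dead cpu). */
	static void flush_unlocked(struct cache *c)
	{
		void *page, *freelist;

		if (c->page) {
			page = detach_slab(c, &freelist);
			deactivate(page, freelist);
		}
	}

	int main(void)
	{
		int dummy_page, dummy_free;
		struct cache c = {
			.page = &dummy_page,
			.freelist = &dummy_free,
			.lock = PTHREAD_MUTEX_INITIALIZER,
		};

		flush_locked(&c);	/* detaches under the lock, then deactivates */
		flush_unlocked(&c);	/* nothing left to detach, does nothing */
		return 0;
	}

The point of the shape, as in the patch above, is that the locked region
covers only the pointer juggling; whether a lock is needed at all is
decided entirely at the call site rather than by a bool flag inside the
helper.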