On 8/17/21 11:12 AM, Sebastian Andrzej Siewior wrote:
> On 2021-08-17 10:37:48 [+0200], Vlastimil Babka wrote:
>> OK reproduced. Thanks, will investigate.
>
> With the local_lock at the top, the needed alignment gets broken for dbl
> cmpxchg. On RT it was working ;)

I'd rather have page and partial in the same cacheline as well, is it ok
to just move the lock as last and not care about whether it straddles
cachelines or not? (with CONFIG_SLUB_CPU_PARTIAL it will naturally start
with the next cacheline).

> diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
> index b5bcac29b979c..cd14aa1f9bc3c 100644
> --- a/include/linux/slub_def.h
> +++ b/include/linux/slub_def.h
> @@ -42,9 +42,9 @@ enum stat_item {
>  	NR_SLUB_STAT_ITEMS };
>
>  struct kmem_cache_cpu {
> -	local_lock_t lock;	/* Protects the fields below except stat */
>  	void **freelist;	/* Pointer to next available object */
>  	unsigned long tid;	/* Globally unique transaction id */
> +	local_lock_t lock;	/* Protects the fields below except stat */
>  	struct page *page;	/* The slab from which we are allocating */
>  #ifdef CONFIG_SLUB_CPU_PARTIAL
>  	struct page *partial;	/* Partially allocated frozen slabs */
>
> Sebastian
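
For illustration, the alignment constraint can be checked in userspace with
a minimal sketch like the one below. It is not the kernel code. struct
fake_lock is a hypothetical pointer-sized stand-in for local_lock_t (the
real size depends on the config), the two struct names and the PAIR_OK()
macro are made up for this example, and page is a plain void * so the
snippet builds on its own. It assumes the double-word cmpxchg wants
freelist and tid adjacent, with freelist aligned to 2 * sizeof(void *),
and that the struct itself is allocated with that alignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for local_lock_t: one pointer-sized word.
 * The real size varies with the kernel configuration. */
struct fake_lock { void *owner; };

/* Lock first: freelist starts one word into the struct. */
struct cpu_slab_lock_first {
	struct fake_lock lock;
	void **freelist;
	unsigned long tid;
	void *page;
};

/* Lock last: freelist/tid lead the struct and page follows directly. */
struct cpu_slab_lock_last {
	void **freelist;
	unsigned long tid;
	void *page;
	struct fake_lock lock;
};

/* True when freelist sits on a double-word boundary within the struct
 * and tid is the word immediately after it. */
#define PAIR_OK(type) \
	(offsetof(type, freelist) % (2 * sizeof(void *)) == 0 && \
	 offsetof(type, tid) == offsetof(type, freelist) + sizeof(void *))

static bool lock_first_ok(void) { return PAIR_OK(struct cpu_slab_lock_first); }
static bool lock_last_ok(void)  { return PAIR_OK(struct cpu_slab_lock_last); }
```

With a pointer-sized lock in front, lock_first_ok() is false because
freelist lands one word past the boundary, while lock_last_ok() is true
and page stays right behind freelist/tid.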