On 6/14/21 1:16 PM, Sebastian Andrzej Siewior wrote:
> On 2021-06-14 13:07:14 [+0200], Vlastimil Babka wrote:
>> > +#ifdef CONFIG_PREEMPT_RT
>> > +#define slub_get_cpu_ptr(var) get_cpu_ptr(var)
>> > +#define slub_put_cpu_ptr(var) put_cpu_ptr(var)
>>
>> After Mel's report and bisect pointing to this patch, I realized I got the
>> #ifdef wrong and it should be #ifndef
>
> So if you got the ifdef wrong (and kept everything as-is) then you
> tested the RT version on !RT. migrate_disable() behaves on !RT as on RT.
> As per changelog you don't use migrate_disable() unconditionally because
> it increases the overhead on !RT.

Correct.

> I haven't looked at the series and I have just this tiny question: why
> did migrate_disable() crash for Mel on !RT and why do you expect that it
> does not happen on PREEMPT_RT?

Right, so it's because __slab_alloc() has this optimization to avoid re-reading
'c' in case there is no preemption enabled at all (or it's just voluntary).

#ifdef CONFIG_PREEMPTION
        /*
         * We may have been preempted and rescheduled on a different
         * cpu before disabling preemption. Need to reload cpu area
         * pointer.
         */
        c = slub_get_cpu_ptr(s->cpu_slab);
#endif

Mel's config has CONFIG_PREEMPT_VOLUNTARY, which means CONFIG_PREEMPTION is not
enabled.

But then later in ___slab_alloc() we have

        slub_put_cpu_ptr(s->cpu_slab);
        page = new_slab(s, gfpflags, node);
        c = slub_get_cpu_ptr(s->cpu_slab);

And this is not hidden under CONFIG_PREEMPTION, so with the #ifdef bug the
slub_put_cpu_ptr() did a migrate_enable() with Mel's config, without a prior
migrate_disable().

If there wasn't the #ifdef PREEMPT_RT bug:
- this slub_put_cpu_ptr() would translate to put_cpu_ptr() thus
  preempt_enable(), which on this config is just a barrier(), so it doesn't
  matter that there was no matching preempt_disable() before.
- with PREEMPT_RT the CONFIG_PREEMPTION would be enabled, so the
  slub_get_cpu_ptr() would do a migrate_disable() and there's no imbalance.

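A rough userspace model of the sequence described above may make the imbalance easier to see. This is only an illustration, not kernel code: model_get(), model_put(), model_alloc_path() and the two flags are made-up names, and the nesting counter merely stands in for whatever migrate_disable()/migrate_enable() track per task.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
static int depth;   /* stands in for the migrate-disable nesting depth */
static int lowest;  /* lowest depth seen; negative means an unmatched enable */

/* Models slub_get_cpu_ptr(): counts only when it maps to migrate_disable(). */
static void model_get(bool ptr_ops_migrate)
{
	if (ptr_ops_migrate)
		depth++;
	/* otherwise it models preempt_disable() acting as a plain barrier() */
}

/* Models slub_put_cpu_ptr(): counts only when it maps to migrate_enable(). */
static void model_put(bool ptr_ops_migrate)
{
	if (ptr_ops_migrate) {
		depth--;
		if (depth < lowest)
			lowest = depth;
	}
}

/*
 * preemption:      models CONFIG_PREEMPTION=y
 * ptr_ops_migrate: slub_{get,put}_cpu_ptr() map to migrate_{disable,enable}()
 * Returns the lowest nesting depth observed on the path.
 */
static int model_alloc_path(bool preemption, bool ptr_ops_migrate)
{
	depth = 0;
	lowest = 0;

	/* __slab_alloc(): the reload is compiled out without CONFIG_PREEMPTION */
	if (preemption)
		model_get(ptr_ops_migrate);

	/* ___slab_alloc(): unguarded put/get pair around new_slab() */
	model_put(ptr_ops_migrate);
	/* new_slab() would run here */
	model_get(ptr_ops_migrate);

	assert(lowest <= 0); /* lowest starts at 0 and can only decrease */
	return lowest;
}
```

In this model, model_alloc_path(false, true) corresponds to Mel's config with the #ifdef bug and dips below zero, while model_alloc_path(false, false) (the same config without the bug) and model_alloc_path(true, true) (PREEMPT_RT) stay balanced.
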
But now that I dig into this in detail, I can see there might be another
instance of this imbalance bug, if CONFIG_PREEMPTION is disabled, but
CONFIG_PREEMPT_COUNT is enabled, which seems to be possible in some debug
scenarios. Because then preempt_disable()/preempt_enable() still manipulate the
preempt counter and compiling them out in __slab_alloc() will cause an
imbalance.

So I think the guards in __slab_alloc() should be using CONFIG_PREEMPT_COUNT
instead of CONFIG_PREEMPTION to be correct on all configs. I dare not remove
them completely :)
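
The difference between the two guards can be sketched in userspace with the preprocessor, assuming a debug-style config where the counter exists but preemption does not. The MODEL_* macros and model_* functions are hypothetical stand-ins for CONFIG_PREEMPT_COUNT, CONFIG_PREEMPTION, preempt_disable() and preempt_enable(); this is not the actual patch.

```c
#include <assert.h>

/* Hypothetical model config: preempt counter present, preemption disabled. */
#define MODEL_PREEMPT_COUNT 1
#define MODEL_PREEMPTION    0

static int count;   /* stands in for the preempt counter */
static int lowest;  /* lowest value seen; negative means an imbalance */

#if MODEL_PREEMPT_COUNT
static void model_preempt_disable(void) { count++; }
static void model_preempt_enable(void)
{
	count--;
	if (count < lowest)
		lowest = count;
}
#else
/* Without the counter both calls reduce to a compiler barrier. */
static void model_preempt_disable(void) { }
static void model_preempt_enable(void) { }
#endif

/* Guard keyed on preemption, as in the snippet quoted earlier. */
static int lowest_with_preemption_guard(void)
{
	count = lowest = 0;
#if MODEL_PREEMPTION
	model_preempt_disable();  /* compiled out in this model config */
#endif
	model_preempt_enable();   /* unguarded put in ___slab_alloc() */
	model_preempt_disable();  /* unguarded get after new_slab() */
	assert(count >= lowest);
	return lowest;
}

/* Guard keyed on the counter being present, as suggested above. */
static int lowest_with_preempt_count_guard(void)
{
	count = lowest = 0;
#if MODEL_PREEMPT_COUNT
	model_preempt_disable();  /* kept whenever the counter exists */
#endif
	model_preempt_enable();
	model_preempt_disable();
	assert(count >= lowest);
	return lowest;
}
```

With these model settings the first variant drops the counter below zero and the second keeps it balanced; with MODEL_PREEMPT_COUNT set to 0 both calls become no-ops and neither variant counts anything.
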