The patch titled
     Subject: mm/slab.c: protect cache_reap() against CPU and memory hot plug operations
has been added to the -mm tree.  Its filename is
     mm-slab-protect-cache_reap-against-cpu-and-memory-hot-plug-operations.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-slab-protect-cache_reap-against-cpu-and-memory-hot-plug-operations.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-slab-protect-cache_reap-against-cpu-and-memory-hot-plug-operations.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Laurent Dufour <ldufour@xxxxxxxxxxxxx>
Subject: mm/slab.c: protect cache_reap() against CPU and memory hot plug operations

Commit 95402b382901 ("cpu-hotplug: replace per-subsystem mutexes with
get_online_cpus()") removed the CPU_LOCK_ACQUIRE operation, which was used
to grab the cache_chain_mutex lock that protected cache_reap() against CPU
hot plug operations.

Later, 18004c5d4084 ("mm, sl[aou]b: Use a common mutex definition")
changed cache_chain_mutex to slab_mutex, but this did not restore the
missing cache_reap() protection against CPU hot plug operations.

Here we stop the per-CPU worker while holding slab_mutex to ensure that
cache_reap() is not running behind our back and will not be triggered
anymore for this CPU.

This patch fixes that race, which leads to SLAB data corruption when CPU
hotplug operations are triggered.  We hit it while doing partition
migration on PowerVM, which leads to CPU reconfiguration through the CPU
hotplug mechanism.

This fix covers kernels containing commit 6731d4f12315 ("slab: Convert to
hotplug state machine"), i.e. 4.9.1; earlier kernels need a slightly
different patch.

Link: http://lkml.kernel.org/r/20190311191701.24325-1-ldufour@xxxxxxxxxxxxx
Signed-off-by: Laurent Dufour <ldufour@xxxxxxxxxxxxx>
Cc: Christoph Lameter <cl@xxxxxxxxx>
Cc: Pekka Enberg <penberg@xxxxxxxxxx>
Cc: David Rientjes <rientjes@xxxxxxxxxx>
Cc: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
Cc: <stable@xxxxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

--- a/mm/slab.c~mm-slab-protect-cache_reap-against-cpu-and-memory-hot-plug-operations
+++ a/mm/slab.c
@@ -1103,6 +1103,7 @@ static int slab_online_cpu(unsigned int
 
 static int slab_offline_cpu(unsigned int cpu)
 {
+	mutex_lock(&slab_mutex);
 	/*
 	 * Shutdown cache reaper. Note that the slab_mutex is held so
 	 * that if cache_reap() is invoked it cannot do anything
@@ -1112,6 +1113,7 @@ static int slab_offline_cpu(unsigned int
 	cancel_delayed_work_sync(&per_cpu(slab_reap_work, cpu));
 	/* Now the cache_reaper is guaranteed to be not running. */
 	per_cpu(slab_reap_work, cpu).work.func = NULL;
+	mutex_unlock(&slab_mutex);
 	return 0;
 }
 
_

Patches currently in -mm which might be from ldufour@xxxxxxxxxxxxx are

mm-slab-protect-cache_reap-against-cpu-and-memory-hot-plug-operations.patch
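
For readers who want to experiment with the locking pattern outside the kernel, the following is a minimal user-space sketch, not the kernel implementation. It assumes POSIX threads, and every name in it (demo_mutex, demo_reaper, demo_offline, run_offline_demo) is hypothetical. It also assumes that the periodic worker only tries the lock and backs off when it is contended, which is how the comment in the patch ("if cache_reap() is invoked it cannot do anything") is modelled here. Under that assumption the teardown path can hold the mutex across a synchronous stop of the worker without deadlocking.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical user-space stand-ins; none of these are kernel APIs. */
static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;
static atomic_bool demo_stop;	/* set by teardown to stop the worker */
static atomic_int demo_reaps;	/* passes that obtained the mutex */

static void demo_sleep_ms(long ms)
{
	struct timespec ts = { .tv_sec = 0, .tv_nsec = ms * 1000000L };

	nanosleep(&ts, NULL);
}

/* Periodic worker: touches shared state only if trylock succeeds. */
static void *demo_reaper(void *arg)
{
	(void)arg;
	while (!atomic_load(&demo_stop)) {
		if (pthread_mutex_trylock(&demo_mutex) == 0) {
			atomic_fetch_add(&demo_reaps, 1);
			pthread_mutex_unlock(&demo_mutex);
		}
		/* On contention the pass is skipped, never blocked. */
		demo_sleep_ms(1);
	}
	return NULL;
}

/* Teardown: hold the mutex across the synchronous stop of the worker. */
static int demo_offline(pthread_t worker)
{
	int ret;

	pthread_mutex_lock(&demo_mutex);
	atomic_store(&demo_stop, true);
	/* Safe to wait here: the worker never blocks on demo_mutex. */
	ret = pthread_join(worker, NULL);
	pthread_mutex_unlock(&demo_mutex);
	return ret;
}

/* Returns 0 if the worker was stopped and did no work afterwards. */
int run_offline_demo(void)
{
	pthread_t worker;
	int after;

	atomic_store(&demo_stop, false);
	atomic_store(&demo_reaps, 0);
	if (pthread_create(&worker, NULL, demo_reaper, NULL) != 0)
		return -1;
	demo_sleep_ms(20);		/* let the worker run a few passes */
	if (demo_offline(worker) != 0)
		return -1;
	after = atomic_load(&demo_reaps);
	demo_sleep_ms(5);
	assert(atomic_load(&demo_reaps) == after);
	return 0;
}
```

Calling run_offline_demo() from a small main() and building with pthread support is enough to try it. If the worker used a blocking pthread_mutex_lock() instead of the trylock, the join inside demo_offline() could wait forever, which is why the trylock assumption matters for this pattern.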