On Fri, Apr 28, 2017 at 06:27:28PM +0200, Sebastian Andrzej Siewior wrote:
> Upstream commit dc434e056fe1dada20df7ba07f32739d3a701adf
>
> The setup/remove_state/instance() functions in the hotplug core code are
> serialized against concurrent CPU hotplug, but unfortunately not serialized
> against themself.
>
> As a consequence a concurrent invocation of these function results in
> corruption of the callback machinery because two instances try to invoke
> callbacks on remote cpus at the same time. This results in missing callback
> invocations and initiator threads waiting forever on the completion.
>
> The obvious solution to replace get_cpu_online() with cpu_hotplug_begin()
> is not possible because at least one callsite calls into these functions
> from a get_online_cpu() locked region.
>
> Extend the protection scope of the cpuhp_state_mutex from solely protecting
> the state arrays to cover the callback invocation machinery as well.
>
> Fixes: 5b7aa87e0482 ("cpu/hotplug: Implement setup/removal interface")
> Reported-and-tested-by: Bart Van Assche <Bart.VanAssche@xxxxxxxxxxx>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
> Cc: hpa@xxxxxxxxx
> Cc: mingo@xxxxxxxxxx
> Cc: akpm@xxxxxxxxxxxxxxxxxxxx
> Cc: torvalds@xxxxxxxxxxxxxxxxxxxx
> Link: http://lkml.kernel.org/r/20170314150645.g4tdyoszlcbajmna@xxxxxxxxxxxxx
> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> ---
>  kernel/cpu.c | 28 ++++++++++++++--------------
>  1 file changed, 14 insertions(+), 14 deletions(-)

Doesn't apply to 4.9-stable, can you provide a backport there if you
want it applied to that tree (I think you do...)

thanks,

greg k-h