This patch series was created to help catch a rather long-standing
problem with smp_call_function_any() and friends.

Very rarely a remote CPU seems not to execute a queued function, and
the CPU queueing that function request will wait forever for the CSD
lock to be released by the remote CPU.

This problem has been observed primarily when running as a guest on
top of KVM or Xen, but there are reports of the same pattern on bare
metal, too. It seems to have existed for about two years, and not much
data is available.

What is known so far is that resending an IPI to the remote CPU helps.

The patches add more debug data to the output printed in a hang
situation by a kernel with CONFIG_CSD_LOCK_WAIT_DEBUG configured.
Additionally, the debug code can be controlled via a new boot
parameter, to make it easier to use such a kernel in a production
environment without too much negative performance impact. By default
the debugging additions are switched off; they can be activated via
the new boot parameter:

csdlock_debug=1 will switch on the basic debugging and IPI resend
csdlock_debug=ext will additionally print more data in a hang
  situation, but this option will have a larger impact on performance.

I hope that the "ext" setting will help to find the root cause of the
problem.

Juergen Gross (4):
  kernel/smp: add boot parameter for controlling CSD lock debugging
  kernel/smp: prepare more CSD lock debugging
  kernel/smp: add more data to CSD lock debugging
  kernel/smp: fix flush_smp_call_function_queue() cpu offline detection

 .../admin-guide/kernel-parameters.txt         |  10 +
 kernel/smp.c                                  | 280 +++++++++++++++++-
 2 files changed, 277 insertions(+), 13 deletions(-)

-- 
2.26.2