In a heavily loaded system, frequently turning CPUs on and off causes
the kernel to detect soft lockups on multiple CPUs. The detailed bug
report is at https://lkml.org/lkml/2011/8/24/185.

The root cause is that the brlock write functions, i.e. br_write_lock()
and br_write_unlock(), only lock/unlock the per-CPU spinlocks of CPUs
that are online. If the spinlock of an online CPU is taken and that CPU
then goes offline, any unlock operation that happens while it is offline
will not touch that spinlock; when the CPU comes online again, it has
the incorrect brlock state (its per-CPU spinlock is still held).

Fix this by making br_write_lock() and br_write_unlock() use the
_global_lock()/_global_unlock() variants instead of the _online ones.

This has been verified on the current kernel: I can reproduce the bug on
an unmodified 3.1 kernel. With this patch applied, I ran an 8-hour test
(using the test script provided by the bug reporter) and no soft lockup
occurred.

Signed-off-by: Cong Meng <mc@xxxxxxxxxxxxxxxxxx>
Reported-by: Srivatsa S. Bhat <srivatsa.bhat@xxxxxxxxxxxxxxxxxx>
---
 include/linux/lglock.h |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/lglock.h b/include/linux/lglock.h
index f549056..08b9e84 100644
--- a/include/linux/lglock.h
+++ b/include/linux/lglock.h
@@ -27,8 +27,8 @@
 #define br_lock_init(name)	name##_lock_init()
 #define br_read_lock(name)	name##_local_lock()
 #define br_read_unlock(name)	name##_local_unlock()
-#define br_write_lock(name)	name##_global_lock_online()
-#define br_write_unlock(name)	name##_global_unlock_online()
+#define br_write_lock(name)	name##_global_lock()
+#define br_write_unlock(name)	name##_global_unlock()
 
 #define DECLARE_BRLOCK(name)	DECLARE_LGLOCK(name)
 #define DEFINE_BRLOCK(name)	DEFINE_LGLOCK(name)
-- 
1.7.5.4
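
For readers who want to see the failure mode described in the changelog in isolation, here is a minimal user-space sketch. It is not kernel code: the per-CPU spinlocks and the online mask are modelled as plain boolean arrays, and every name in it (struct toy_brlock, the toy_* functions, NR_TOY_CPUS) is hypothetical and invented for this illustration. It only assumes what the changelog states, namely that the old write lock/unlock touch online CPUs only, while the patched versions touch every CPU.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_TOY_CPUS 4	/* hypothetical CPU count for the simulation */

/* Toy stand-in for the per-CPU spinlocks plus the CPU online mask. */
struct toy_brlock {
	bool locked[NR_TOY_CPUS];
	bool online[NR_TOY_CPUS];
};

static void toy_init(struct toy_brlock *l)
{
	for (int cpu = 0; cpu < NR_TOY_CPUS; cpu++) {
		l->locked[cpu] = false;
		l->online[cpu] = true;
	}
}

static void toy_set_online(struct toy_brlock *l, int cpu, bool on)
{
	assert(cpu >= 0 && cpu < NR_TOY_CPUS);
	l->online[cpu] = on;
}

/* Write lock/unlock. If online_only is true, mimic the old behaviour
 * and skip CPUs that are offline at the time of the call. */
static void toy_write_set(struct toy_brlock *l, bool lock, bool online_only)
{
	for (int cpu = 0; cpu < NR_TOY_CPUS; cpu++) {
		if (online_only && !l->online[cpu])
			continue;
		l->locked[cpu] = lock;
	}
}

static int toy_count_held(const struct toy_brlock *l)
{
	int held = 0;

	for (int cpu = 0; cpu < NR_TOY_CPUS; cpu++)
		held += l->locked[cpu] ? 1 : 0;
	return held;
}

/* Scenario from the changelog: take the write lock, CPU 1 goes offline,
 * release the write lock, CPU 1 comes back online. Returns the number
 * of per-CPU locks that are still held afterwards. */
int toy_hotplug_scenario(bool online_only)
{
	struct toy_brlock l;

	toy_init(&l);
	toy_write_set(&l, true, online_only);	/* write lock */
	toy_set_online(&l, 1, false);		/* CPU 1 goes offline */
	toy_write_set(&l, false, online_only);	/* write unlock */
	toy_set_online(&l, 1, true);		/* CPU 1 comes back */
	return toy_count_held(&l);
}
```

Calling toy_hotplug_scenario(true) models the old online-only behaviour and leaves CPU 1's lock held, while toy_hotplug_scenario(false) models the patched behaviour and leaves nothing held. In the real kernel a reader on the returning CPU would spin on that stale lock, which would be consistent with the reported soft lockups.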