console_cpu_notify can cause scheduling BUG during CPU hotplug

Hi,

I've hit a crash during CPU hotplug on ARM/MSM in v2.6.38-rc6: the kernel reports "BUG: scheduling while atomic". The cause appears to be that the console CPU notifier, console_cpu_notify(), can block on the console semaphore while it runs in the atomic code path of cpu_stopper_thread, where preemption is explicitly disabled.

The suspected path was added with this commit:

commit 034260d6779087431a8b2f67589c68b919299e5c
Author: Kevin Cernekee <cernekee@xxxxxxxxx>
Date:   Thu Jun 3 22:11:25 2010 -0700

    printk: fix delayed messages from CPU hotplug events

Was this scenario accounted for in the design of the console CPU notifier? One workaround is to remove CPU_DEAD from the actions handled in console_cpu_notify(). In fact, v1-v4 of the patch above did not include CPU_DEAD, CPU_DYING or CPU_DOWN_FAILED in the list of actions, and I wasn't able to track down why those cases were added in the final version.
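
To make the failure mode and the workaround concrete, here is a small user-space model. It is not kernel code: every name in it (hp_action, hp_model, model_console_cpu_notify and so on) is hypothetical, and the "console busy" flag simply stands in for another context holding the console semaphore. It only sketches the reasoning above, assuming that taking a contended semaphore has to sleep and that sleeping with a non-zero preempt count is what triggers the BUG.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the hotplug actions; the values are arbitrary. */
enum hp_action { HP_ONLINE, HP_DEAD, HP_DYING, HP_DOWN_FAILED, HP_UP_CANCELED };

enum hp_result { HP_IGNORED, HP_FLUSHED, HP_WOULD_BUG };

/* Model state, all of it assumed for illustration. */
struct hp_model {
    int preempt_count;      /* non-zero means the caller is atomic */
    bool console_busy;      /* someone else holds the console semaphore */
    unsigned handled_mask;  /* one bit per action the notifier reacts to */
};

static unsigned hp_bit(enum hp_action a)
{
    assert(a >= HP_ONLINE && a <= HP_UP_CANCELED);
    return 1u << a;
}

/* Models console_lock()/console_unlock(): a contended semaphore must
 * sleep, which is not allowed while preempt_count is non-zero. */
static enum hp_result model_console_flush(const struct hp_model *m)
{
    if (m->console_busy && m->preempt_count != 0)
        return HP_WOULD_BUG;    /* "scheduling while atomic" */
    return HP_FLUSHED;
}

/* Models the notifier: it only touches the console for handled actions. */
static enum hp_result model_console_cpu_notify(const struct hp_model *m,
                                               enum hp_action a)
{
    if (!(m->handled_mask & hp_bit(a)))
        return HP_IGNORED;
    return model_console_flush(m);
}
```

With preempt_count set to 1, console_busy set to true and an action present in handled_mask, model_console_cpu_notify() returns HP_WOULD_BUG. Clearing that action's bit, which is the shape of the workaround, makes the same call return HP_IGNORED, at the cost of no longer flushing the console for that event.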

Crash log:

<3>[   21.408237] BUG: scheduling while atomic: migration/1/371/0x00000002
<4>[   21.408247] Modules linked in:
<4>[   21.408286] [<c0050e40>] (unwind_backtrace+0x0/0x128) from [<c056748c>] (schedule+0x9c/0x6c4)
<4>[   21.408303] [<c056748c>] (schedule+0x9c/0x6c4) from [<c0567d04>] (schedule_timeout+0x1c/0x208)
<4>[   21.408319] [<c0567d04>] (schedule_timeout+0x1c/0x208) from [<c0568fac>] (__down+0x68/0x98)
<4>[   21.408337] [<c0568fac>] (__down+0x68/0x98) from [<c00d844c>] (down+0x2c/0x3c)
<4>[   21.408354] [<c00d844c>] (down+0x2c/0x3c) from [<c00bb23c>] (console_lock+0x38/0x60)
<4>[   21.408377] [<c00bb23c>] (console_lock+0x38/0x60) from [<c0564c80>] (console_cpu_notify+0x20/0x2c)
<4>[   21.408394] [<c0564c80>] (console_cpu_notify+0x20/0x2c) from [<c00d8488>] (notifier_call_chain+0x2c/0x70)
<4>[   21.408410] [<c00d8488>] (notifier_call_chain+0x2c/0x70) from [<c00bc318>] (__cpu_notify+0x24/0x3c)
<4>[   21.408425] [<c00bc318>] (__cpu_notify+0x24/0x3c) from [<c0552e7c>] (take_cpu_down+0x2c/0x34)
<4>[   21.408444] [<c0552e7c>] (take_cpu_down+0x2c/0x34) from [<c00f34d4>] (stop_machine_cpu_stop+0xc0/0x11c)
<4>[   21.408462] [<c00f34d4>] (stop_machine_cpu_stop+0xc0/0x11c) from [<c00f337c>] (cpu_stopper_thread+0xc8/0x160)
<4>[   21.408482] [<c00f337c>] (cpu_stopper_thread+0xc8/0x160) from [<c00d30b0>] (kthread+0x80/0x88)
<4>[   21.408498] [<c00d30b0>] (kthread+0x80/0x88) from [<c004b6a0>] (kernel_thread_exit+0x0/0x8)

Thanks,
Mike

--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum