On Tuesday, September 09, 2014 10:44:17 PM Lan Tianyu wrote: > There is a dead lock between between cpu hotplug and handler of ACPI Processor > Power notify event. Cpu hotplug lock is held and then hold cpuidle lock during cpu hotplug. > During ACPI processor power notify event, acpi_processor_cst_has_chaged() does converse > thing and cause dead lock. Just like the following log shows. Isn't this patch equivalent to commit 6726655dfdd2 (ACPI / cpuidle: fix deadlock between cpuidle_lock and cpu_hotplug.lock)? > [ 91.456670] ====================================================== > [ 91.457160] [ INFO: possible circular locking dependency detected ] > [ 91.457658] 3.17.0-rc3+ #15 Not tainted > [ 91.457968] ------------------------------------------------------- > [ 91.458463] kworker/0:1/75 is trying to acquire lock: > [ 91.458866] (cpu_hotplug.lock){++++++}, at: [<ffffffff810c3334>] get_online_cpus+0x24/0x70 > [ 91.459600] > [ 91.459600] but task is already holding lock: > [ 91.460062] (cpuidle_lock){+.+.+.}, at: [<ffffffff81666147>] cpuidle_pause_and_lock+0x17/0x40 > [ 91.460810] > [ 91.460810] which lock already depends on the new lock. > [ 91.460810] > [ 91.461451] > [ 91.461451] the existing dependency chain (in reverse order) is: > [ 91.462038] > [ 91.462038] -> #2 (cpuidle_lock){+.+.+.}: > [ 91.462442] [<ffffffff8110cba4>] lock_acquire+0xa4/0x120 > [ 91.462931] [<ffffffff817eb4de>] mutex_lock_nested+0x4e/0x3b0 > [ 91.463453] [<ffffffff81666147>] cpuidle_pause_and_lock+0x17/0x40 > [ 91.464000] [<ffffffff8148ea5b>] acpi_processor_hotplug+0x49/0x8d > [ 91.464548] [<ffffffff8148cc9f>] acpi_cpu_soft_notify+0xaf/0xe4 > [ 91.465081] [<ffffffff810e20ac>] notifier_call_chain+0x4c/0x70 > [ 91.465609] [<ffffffff810e217e>] __raw_notifier_call_chain+0xe/0x10 > [ 91.466169] [<ffffffff810c3443>] cpu_notify+0x23/0x50 > [ 91.466637] [<ffffffff810c36a0>] _cpu_up+0x150/0x160 > [ 91.467098] [<ffffffff817dba6f>] enable_nonboot_cpus+0xaf/0x170 > [ 91.467632] [<ffffffff81113085>] hibernation_snapshot+0x275/0x380 > [ 91.468181] [<ffffffff81113990>] hibernate+0x160/0x210 > [ 91.468655] [<ffffffff81111314>] state_store+0xe4/0xf0 > [ 91.469129] [<ffffffff813eb16f>] kobj_attr_store+0xf/0x20 > [ 91.469624] [<ffffffff812aba54>] sysfs_kf_write+0x44/0x60 > [ 91.470118] [<ffffffff812ab357>] kernfs_fop_write+0xe7/0x170 > [ 91.470632] [<ffffffff81230da7>] vfs_write+0xb7/0x1f0 > [ 91.471100] [<ffffffff81231959>] SyS_write+0x49/0xb0 > [ 91.471561] [<ffffffff817ee429>] system_call_fastpath+0x16/0x1b > [ 91.472096] > [ 91.472096] -> #1 (cpu_hotplug.lock#2){+.+.+.}: > [ 91.472556] [<ffffffff8110cba4>] lock_acquire+0xa4/0x120 > [ 91.473044] [<ffffffff817eb4de>] mutex_lock_nested+0x4e/0x3b0 > [ 91.473564] [<ffffffff810c34df>] cpu_hotplug_begin+0x4f/0x80 > [ 91.474078] [<ffffffff810c3584>] _cpu_up+0x34/0x160 > [ 91.474532] [<ffffffff810c3709>] cpu_up+0x59/0x80 > [ 91.474973] [<ffffffff81d8170c>] smp_init+0x86/0x88 > [ 91.475429] [<ffffffff81d5e203>] kernel_init_freeable+0x16c/0x27b > [ 91.475976] [<ffffffff817da77e>] kernel_init+0xe/0xf0 > [ 91.476443] [<ffffffff817ee37c>] ret_from_fork+0x7c/0xb0 > [ 91.476930] > [ 91.476930] -> #0 (cpu_hotplug.lock){++++++}: > [ 91.477359] [<ffffffff8110c0b0>] __lock_acquire+0x1760/0x1a70 > [ 91.477880] [<ffffffff8110cba4>] lock_acquire+0xa4/0x120 > [ 91.478367] [<ffffffff810c335a>] get_online_cpus+0x4a/0x70 > [ 91.478868] [<ffffffff8148eb12>] acpi_processor_cst_has_changed+0x73/0x17d > [ 91.479475] [<ffffffff8148cd5e>] acpi_processor_notify+0x8a/0xe2 > [ 91.480014] [<ffffffff814743fb>] acpi_ev_notify_dispatch+0x44/0x5c > [ 91.480569] [<ffffffff8146019a>] acpi_os_execute_deferred+0x14/0x20 > [ 91.481130] [<ffffffff810dba1f>] process_one_work+0x1df/0x4d0 > [ 91.481652] [<ffffffff810dbe2b>] worker_thread+0x11b/0x490 > [ 91.482152] [<ffffffff810e100d>] kthread+0xed/0x110 > [ 91.482606] [<ffffffff817ee37c>] ret_from_fork+0x7c/0xb0 > [ 91.483094] > [ 91.483094] other info that might help us debug this: > [ 91.483094] > [ 91.483721] Chain exists of: > [ 91.483721] cpu_hotplug.lock --> cpu_hotplug.lock#2 --> cpuidle_lock > [ 91.483721] > [ 91.484486] Possible unsafe locking scenario: > [ 91.484486] > [ 91.491113] CPU0 CPU1 > [ 91.494496] ---- ---- > [ 91.497833] lock(cpuidle_lock); > [ 91.501083] lock(cpu_hotplug.lock#2); > [ 91.504572] lock(cpuidle_lock); > [ 91.507944] lock(cpu_hotplug.lock); > [ 91.511115] > [ 91.511115] *** DEADLOCK *** > [ 91.511115] > [ 91.519889] 3 locks held by kworker/0:1/75: > [ 91.522918] #0: ("kacpi_notify"){++++..}, at: [<ffffffff810db9bd>] process_one_work+0x17d/0x4d0 > [ 91.526445] #1: ((&dpc->work)){+.+...}, at: [<ffffffff810db9bd>] process_one_work+0x17d/0x4d0 > [ 91.530031] #2: (cpuidle_lock){+.+.+.}, at: [<ffffffff81666147>] cpuidle_pause_and_lock+0x17/0x40 > [ 91.533653] > [ 91.533653] stack backtrace: > [ 91.539712] CPU: 0 PID: 75 Comm: kworker/0:1 Not tainted 3.17.0-rc3+ #15 > [ 91.543086] Hardware name: Intel Corporation CHERRYVIEW A3 PLATFORM/Braswell CRB, BIOS BRASWEL1.X64.0035.R00.1409041111 09/04/2014 > [ 91.546930] Workqueue: kacpi_notify acpi_os_execute_deferred > [ 91.550405] ffffffff82656d40 ffff880078883b68 ffffffff817e44fd ffffffff82656b90 > [ 91.554096] ffff880078883ba8 ffffffff817dff2a ffff880078883c00 ffff88007884e218 > [ 91.557807] 0000000000000002 0000000000000003 ffff88007884e218 ffff88007884da50 > [ 91.561524] Call Trace: > [ 91.564833] [<ffffffff817e44fd>] dump_stack+0x45/0x56 > [ 91.568350] [<ffffffff817dff2a>] print_circular_bug+0x200/0x20f > [ 91.571946] [<ffffffff8110c0b0>] __lock_acquire+0x1760/0x1a70 > [ 91.575529] [<ffffffff8110a42a>] ? mark_held_locks+0x6a/0x90 > [ 91.579100] [<ffffffff810856f9>] ? flat_send_IPI_allbutself+0xe9/0x140 > [ 91.582747] [<ffffffff8110cba4>] lock_acquire+0xa4/0x120 > [ 91.586323] [<ffffffff810c3334>] ? get_online_cpus+0x24/0x70 > [ 91.589927] [<ffffffff810c335a>] get_online_cpus+0x4a/0x70 > [ 91.593520] [<ffffffff810c3334>] ? get_online_cpus+0x24/0x70 > [ 91.597125] [<ffffffff8148eb12>] acpi_processor_cst_has_changed+0x73/0x17d > [ 91.600840] [<ffffffff8148cd5e>] acpi_processor_notify+0x8a/0xe2 > [ 91.604508] [<ffffffff814743fb>] acpi_ev_notify_dispatch+0x44/0x5c > [ 91.608188] [<ffffffff8146019a>] acpi_os_execute_deferred+0x14/0x20 > [ 91.611856] [<ffffffff810dba1f>] process_one_work+0x1df/0x4d0 > [ 91.615495] [<ffffffff810db9bd>] ? process_one_work+0x17d/0x4d0 > [ 91.619121] [<ffffffff810dbe2b>] worker_thread+0x11b/0x490 > [ 91.622713] [<ffffffff810dbd10>] ? process_one_work+0x4d0/0x4d0 > [ 91.626347] [<ffffffff810e100d>] kthread+0xed/0x110 > [ 91.629907] [<ffffffff810e0f20>] ? kthread_create_on_node+0x200/0x200 > [ 91.633606] [<ffffffff817ee37c>] ret_from_fork+0x7c/0xb0 > [ 91.637238] [<ffffffff810e0f20>] ? kthread_create_on_node+0x200/0x200 > > This patch is to change the sequence of holding cpu hotplug > and cpu idle lock in the acpi_processor_cst_has_changed() to avoid the > dead lock. > > Signed-off-by: Lan Tianyu <tianyu.lan@xxxxxxxxx> > --- > drivers/acpi/processor_idle.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c > index 3dca36d..385ec5f 100644 > --- a/drivers/acpi/processor_idle.c > +++ b/drivers/acpi/processor_idle.c > @@ -1071,9 +1071,9 @@ int acpi_processor_cst_has_changed(struct acpi_processor *pr) > > if (pr->id == 0 && cpuidle_get_driver() == &acpi_idle_driver) { > > - cpuidle_pause_and_lock(); > /* Protect against cpu-hotplug */ > get_online_cpus(); > + cpuidle_pause_and_lock(); > > /* Disable all cpuidle devices */ > for_each_online_cpu(cpu) { > @@ -1100,8 +1100,9 @@ int acpi_processor_cst_has_changed(struct acpi_processor *pr) > cpuidle_enable_device(dev); > } > } > - put_online_cpus(); > + > cpuidle_resume_and_unlock(); > + put_online_cpus(); > } > > return 0; > -- I speak only for myself. Rafael J. Wysocki, Intel Open Source Technology Center. -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html