Hi, Rafael

I think you know this issue. [PATCH 1] can trigger this deadlock because it is actually based on another GPE deadlock fixing series. I have fixed the deadlock in acpi_ev_gpe_detect()/acpi_ev_gpe_dispatch(), but those fixes haven't been upstreamed to ACPICA yet, so I couldn't post them here. (A minimal sketch of the inverted lock ordering reported by lockdep is appended after the quoted trace below.)

I was thinking we could work around this by applying the acpi_os_wait_events_complete() enhancement before applying this series, because I thought the deadlock could only happen during suspend. But it seems it can also be triggered during boot. So we have 3 choices for merging this series:
1. Merge the GPE deadlock fix before it is merged in the ACPICA upstream.
2. Change [PATCH 1] so that it does not hold the EC lock for now (this is racy, but the current code is racy anyway).
3. Revert [PATCH 1-4] and wait until the GPE deadlock is fixed in the ACPICA upstream.
Which one do you prefer?

IMO, we have several issues whose fixes form a dependency circle:
1. GPE deadlock: it may depend on DISPATCH_METHOD flushing (we shouldn't bump the enabling status up in acpi_ev_asynch_enable_gpe()).
2. EC transaction flushing: it depends on the GPE deadlock fix.
3. EC event polling: it depends on EC transaction flushing; this is required to support EC event draining as mentioned in bugzilla 44161.
4. DISPATCH_METHOD flushing: it depends on EC event polling; if we don't move the EC query out of the _Lxx/_Exx work queue, it may block DISPATCH_METHOD flushing.
So it seems we need to determine which one should be merged first. IMO, the GPE deadlock fix is the most basic one.

Thanks and best regards
-Lv

> From: Rafael J. Wysocki [mailto:rjw@xxxxxxxxxxxxx]
> Sent: Wednesday, November 19, 2014 5:20 AM
> To: Kirill A. Shutemov
> Cc: Zheng, Lv; Wysocki, Rafael J; Brown, Len; Lv Zheng; linux-kernel@xxxxxxxxxxxxxxx; linux-acpi@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH 1/6] ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag.
>
> On Tuesday, November 18, 2014 03:23:28 PM Kirill A. Shutemov wrote:
> > On Wed, Nov 05, 2014 at 02:52:36AM +0000, Zheng, Lv wrote:
>
> [cut]
>
> >
> > Here's lockdep warning I see on -next:
>
> Is patch [1/6] sufficient to trigger this or do you need all [1-4/6]?
> >
> > [ 0.510159] ======================================================
> > [ 0.510171] [ INFO: possible circular locking dependency detected ]
> > [ 0.510185] 3.18.0-rc4-next-20141117-07404-g9dad2ab6df8b #66 Not tainted
> > [ 0.510197] -------------------------------------------------------
> > [ 0.510209] swapper/3/0 is trying to acquire lock:
> > [ 0.510219]  (&(&ec->lock)->rlock){-.....}, at: [<ffffffff814d533e>] acpi_ec_gpe_handler+0x21/0xfc
> > [ 0.510254]
> > [ 0.510254] but task is already holding lock:
> > [ 0.510266]  (&(*(&acpi_gbl_gpe_lock))->rlock){-.....}, at: [<ffffffff814cd67e>] acpi_os_acquire_lock+0xe/0x10
> > [ 0.510296]
> > [ 0.510296] which lock already depends on the new lock.
> > [ 0.510296]
> > [ 0.510312]
> > [ 0.510312] the existing dependency chain (in reverse order) is:
> > [ 0.510327]
> > [ 0.510327] -> #1 (&(*(&acpi_gbl_gpe_lock))->rlock){-.....}:
> > [ 0.510344]        [<ffffffff81158f4f>] lock_acquire+0xdf/0x2d0
> > [ 0.510364]        [<ffffffff81b08010>] _raw_spin_lock_irqsave+0x50/0x70
> > [ 0.510381]        [<ffffffff814cd67e>] acpi_os_acquire_lock+0xe/0x10
> > [ 0.510398]        [<ffffffff814e31e8>] acpi_enable_gpe+0x22/0x68
> > [ 0.510416]        [<ffffffff814d5b24>] acpi_ec_start+0x66/0x87
> > [ 0.510432]        [<ffffffff81afc771>] ec_install_handlers+0x41/0xa4
> > [ 0.510449]        [<ffffffff823e72b9>] acpi_ec_ecdt_probe+0x1a9/0x1ea
> > [ 0.510466]        [<ffffffff823e6ae3>] acpi_init+0x8b/0x26e
> > [ 0.510480]        [<ffffffff81002148>] do_one_initcall+0xd8/0x210
> > [ 0.510496]        [<ffffffff8239f1dc>] kernel_init_freeable+0x1f5/0x282
> > [ 0.510513]        [<ffffffff81af1a1e>] kernel_init+0xe/0xf0
> > [ 0.510527]        [<ffffffff81b08cfc>] ret_from_fork+0x7c/0xb0
> > [ 0.510542]
> > [ 0.510542] -> #0 (&(&ec->lock)->rlock){-.....}:
> > [ 0.510558]        [<ffffffff811585ef>] __lock_acquire+0x210f/0x2220
> > [ 0.510574]        [<ffffffff81158f4f>] lock_acquire+0xdf/0x2d0
> > [ 0.510589]        [<ffffffff81b08010>] _raw_spin_lock_irqsave+0x50/0x70
> > [ 0.510604]        [<ffffffff814d533e>] acpi_ec_gpe_handler+0x21/0xfc
> > [ 0.510620]        [<ffffffff814e02c2>] acpi_ev_gpe_dispatch+0xd2/0x143
> > [ 0.510636]        [<ffffffff814e03fb>] acpi_ev_gpe_detect+0xc8/0x10f
> > [ 0.510652]        [<ffffffff814e23b6>] acpi_ev_sci_xrupt_handler+0x22/0x38
> > [ 0.510669]        [<ffffffff814cc8ee>] acpi_irq+0x16/0x31
> > [ 0.510684]        [<ffffffff8116eccf>] handle_irq_event_percpu+0x6f/0x540
> > [ 0.510702]        [<ffffffff8116f1e1>] handle_irq_event+0x41/0x70
> > [ 0.510718]        [<ffffffff81171ef6>] handle_fasteoi_irq+0x86/0x140
> > [ 0.510733]        [<ffffffff81075a22>] handle_irq+0x22/0x40
> > [ 0.510748]        [<ffffffff81b0beaf>] do_IRQ+0x4f/0xf0
> > [ 0.510762]        [<ffffffff81b09bb2>] ret_from_intr+0x0/0x1a
> > [ 0.510777]        [<ffffffff8107e783>] default_idle+0x23/0x260
> > [ 0.510792]        [<ffffffff8107f35f>] arch_cpu_idle+0xf/0x20
> > [ 0.510806]        [<ffffffff8114a99b>] cpu_startup_entry+0x36b/0x5b0
> > [ 0.510821]        [<ffffffff810a8d04>] start_secondary+0x1a4/0x1d0
> > [ 0.510840]
> > [ 0.510840] other info that might help us debug this:
> > [ 0.510840]
> > [ 0.510856]  Possible unsafe locking scenario:
> > [ 0.510856]
> > [ 0.510868]        CPU0                    CPU1
> > [ 0.510877]        ----                    ----
> > [ 0.510886]   lock(&(*(&acpi_gbl_gpe_lock))->rlock);
> > [ 0.510898]                                lock(&(&ec->lock)->rlock);
> > [ 0.510912]                                lock(&(*(&acpi_gbl_gpe_lock))->rlock);
> > [ 0.510927]   lock(&(&ec->lock)->rlock);
> > [ 0.510938]
> > [ 0.510938]  *** DEADLOCK ***
> > [ 0.510938]
> > [ 0.510953] 1 lock held by swapper/3/0:
> > [ 0.510961]  #0: (&(*(&acpi_gbl_gpe_lock))->rlock){-.....}, at: [<ffffffff814cd67e>] acpi_os_acquire_lock+0xe/0x10
> > [ 0.510990]
> > [ 0.510990] stack backtrace:
> > [ 0.511004] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 3.18.0-rc4-next-20141117-07404-g9dad2ab6df8b #66
> > [ 0.511021] Hardware name: LENOVO 3460CC6/3460CC6, BIOS G6ET93WW (2.53 ) 02/04/2013
> > [ 0.511035]  ffffffff82cb2f70 ffff88011e2c3bb8 ffffffff81afc316 0000000000000011
> > [ 0.511055]  ffffffff82cb2f70 ffff88011e2c3c08 ffffffff81afae11 0000000000000001
> > [ 0.511074]  ffff88011e2c3c68 ffff88011e2c3c08 ffff8801193f92d0 ffff8801193f9b20
> > [ 0.511094] Call Trace:
> > [ 0.511101]  <IRQ>  [<ffffffff81afc316>] dump_stack+0x4c/0x6e
> > [ 0.511125]  [<ffffffff81afae11>] print_circular_bug+0x2b2/0x2c3
> > [ 0.511142]  [<ffffffff811585ef>] __lock_acquire+0x210f/0x2220
> > [ 0.511159]  [<ffffffff81158f4f>] lock_acquire+0xdf/0x2d0
> > [ 0.511176]  [<ffffffff814d533e>] ? acpi_ec_gpe_handler+0x21/0xfc
> > [ 0.511192]  [<ffffffff81b08010>] _raw_spin_lock_irqsave+0x50/0x70
> > [ 0.511209]  [<ffffffff814d533e>] ? acpi_ec_gpe_handler+0x21/0xfc
> > [ 0.511225]  [<ffffffff814ea192>] ? acpi_hw_write+0x4b/0x52
> > [ 0.511241]  [<ffffffff814d533e>] acpi_ec_gpe_handler+0x21/0xfc
> > [ 0.511258]  [<ffffffff814e02c2>] acpi_ev_gpe_dispatch+0xd2/0x143
> > [ 0.511274]  [<ffffffff814e03fb>] acpi_ev_gpe_detect+0xc8/0x10f
> > [ 0.511292]  [<ffffffff814e23b6>] acpi_ev_sci_xrupt_handler+0x22/0x38
> > [ 0.511309]  [<ffffffff814cc8ee>] acpi_irq+0x16/0x31
> > [ 0.511325]  [<ffffffff8116eccf>] handle_irq_event_percpu+0x6f/0x540
> > [ 0.511342]  [<ffffffff8116f1e1>] handle_irq_event+0x41/0x70
> > [ 0.511357]  [<ffffffff81171e98>] ? handle_fasteoi_irq+0x28/0x140
> > [ 0.511372]  [<ffffffff81171ef6>] handle_fasteoi_irq+0x86/0x140
> > [ 0.511388]  [<ffffffff81075a22>] handle_irq+0x22/0x40
> > [ 0.511402]  [<ffffffff81b0beaf>] do_IRQ+0x4f/0xf0
> > [ 0.511417]  [<ffffffff81b09bb2>] common_interrupt+0x72/0x72
> > [ 0.511428]  <EOI>  [<ffffffff810b8986>] ? native_safe_halt+0x6/0x10
> > [ 0.511454]  [<ffffffff81154f3d>] ? trace_hardirqs_on+0xd/0x10
> > [ 0.511468]  [<ffffffff8107e783>] default_idle+0x23/0x260
> > [ 0.511482]  [<ffffffff8107f35f>] arch_cpu_idle+0xf/0x20
> > [ 0.511496]  [<ffffffff8114a99b>] cpu_startup_entry+0x36b/0x5b0
> > [ 0.511512]  [<ffffffff810a8d04>] start_secondary+0x1a4/0x1d0
> >
> >
> >
>
> --
> I speak only for myself.
> Rafael J. Wysocki, Intel Open Source Technology Center.
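
[Lv] For clarity, here is a minimal user-space sketch of the AB-BA ordering lockdep is reporting. This is not kernel code: pthread mutexes stand in for acpi_gbl_gpe_lock and ec->lock, and the two thread bodies only mirror the call paths named in the trace (acpi_ec_start() taking ec->lock and then acpi_enable_gpe(), vs. acpi_ev_gpe_dispatch() holding acpi_gbl_gpe_lock when acpi_ec_gpe_handler() runs); everything else is illustrative only.

/*
 * Minimal AB-BA sketch of the lockdep report above (user space, not kernel).
 * Build with: gcc -pthread -o abba abba.c
 */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t gpe_lock = PTHREAD_MUTEX_INITIALIZER; /* stands in for acpi_gbl_gpe_lock */
static pthread_mutex_t ec_lock  = PTHREAD_MUTEX_INITIALIZER; /* stands in for ec->lock */

/* Chain #1 in the report: acpi_ec_start() holds ec->lock and then
 * acpi_enable_gpe() takes acpi_gbl_gpe_lock (ec->lock -> gpe_lock). */
static void *ec_start_path(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&ec_lock);   /* acpi_ec_start() */
	pthread_mutex_lock(&gpe_lock);  /* acpi_enable_gpe() -> acpi_os_acquire_lock() */
	puts("ec_start path: ec->lock, then gpe_lock");
	pthread_mutex_unlock(&gpe_lock);
	pthread_mutex_unlock(&ec_lock);
	return NULL;
}

/* Chain #0 in the report: acpi_ev_gpe_dispatch() runs with acpi_gbl_gpe_lock
 * held and acpi_ec_gpe_handler() then takes ec->lock (gpe_lock -> ec->lock). */
static void *gpe_dispatch_path(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&gpe_lock);  /* acpi_ev_gpe_detect()/acpi_ev_gpe_dispatch() */
	pthread_mutex_lock(&ec_lock);   /* acpi_ec_gpe_handler() */
	puts("GPE dispatch path: gpe_lock, then ec->lock");
	pthread_mutex_unlock(&ec_lock);
	pthread_mutex_unlock(&gpe_lock);
	return NULL;
}

int main(void)
{
	pthread_t t1, t2;

	/*
	 * If the two paths run concurrently and each grabs its first lock
	 * before the other releases, neither can take its second lock:
	 * the classic AB-BA deadlock lockdep is warning about.
	 */
	pthread_create(&t1, NULL, ec_start_path, NULL);
	pthread_create(&t2, NULL, gpe_dispatch_path, NULL);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	return 0;
}

Whichever of the 3 options above we pick, the point is that both paths must end up taking the two locks in a consistent order, or one path must stop taking one of the locks while the other is held, which is roughly what option 2 does.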