Re: lockdep warning in 5.9 caused by "serial: 8250_dw: Fix common clocks usage race condition"

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hans,

On Fri, Sep 11, 2020 at 11:03:41PM +0200, Hans de Goede wrote:
> Hi Serge,
> 
> I noticed that on various Cherry Trail (x86) based
> devices, which use a 8250_dw UART for their blue-tooth,
> I'm now (with 5.9) seeing a lockdep warning on suspend/resume",
> see below.
> 
> This is caused by commit cc816969d7b5 ("serial: 8250_dw: Fix common
> clocks usage race condition"). If I revert that commit the lockdep
> warning goes away.

Yes, suspending the port will definitely cause the deadlock. I can see that now
even without the lockdep warning.( Alas I haven't tested the system suspend
before releasing the patchset. The difficulty is to figure out how to fix it
better. Moving the clock-change callback registration/de-registration to the
probe/remove method causes another problem. I'll need to give it some time to
find a proper solution.

-Sergey

> 
> Regards,
> 
> Hans
> 
> Here is the promised lockdep output:
> 
> [  843.852886] ======================================================
> [  843.852888] WARNING: possible circular locking dependency detected
> [  843.852894] 5.9.0-rc3+ #38 Tainted: G        W   E
> [  843.852896] ------------------------------------------------------
> [  843.852900] systemd-sleep/1628 is trying to acquire lock:
> [  843.852905] ffff935e30344d80 ((work_completion)(&data->clk_work)){+.+.}-{0:0}, at: __flush_work+0x2e6/0x460
> [  843.852922]
>                but task is already holding lock:
> [  843.852924] ffff935e302f1420 (&port->mutex){+.+.}-{3:3}, at: uart_suspend_port+0x3f/0x1f0
> [  843.852938]
>                which lock already depends on the new lock.
> 
> [  843.852940]
>                the existing dependency chain (in reverse order) is:
> [  843.852943]
>                -> #1 (&port->mutex){+.+.}-{3:3}:
> [  843.852954]        __mutex_lock+0x7b/0x820
> [  843.852959]        serial8250_update_uartclk+0x34/0x1b0
> [  843.852965]        process_one_work+0x23f/0x570
> [  843.852969]        worker_thread+0x55/0x3c0
> [  843.852974]        kthread+0x138/0x160
> [  843.852981]        ret_from_fork+0x22/0x30
> [  843.852984]
>                -> #0 ((work_completion)(&data->clk_work)){+.+.}-{0:0}:
> [  843.852995]        __lock_acquire+0x1238/0x2080
> [  843.852999]        lock_acquire+0xa7/0x3d0
> [  843.853003]        __flush_work+0x30d/0x460
> [  843.853008]        uart_suspend_port+0x1b9/0x1f0
> [  843.853014]        dw8250_suspend+0x10/0x20
> [  843.853020]        dpm_run_callback+0x51/0xc0
> [  843.853025]        __device_suspend+0x1f9/0x450
> [  843.853029]        dpm_suspend+0x13d/0x1e0
> [  843.853033]        dpm_suspend_start+0x77/0x80
> [  843.853040]        suspend_devices_and_enter+0xe6/0x620
> [  843.853044]        pm_suspend.cold+0x333/0x389
> [  843.853048]        state_store+0x71/0xd0
> [  843.853056]        kernfs_fop_write+0xdc/0x1c0
> [  843.853063]        vfs_write+0xec/0x230
> [  843.853068]        ksys_write+0x58/0xd0
> [  843.853075]        do_syscall_64+0x33/0x40
> [  843.853079]        entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [  843.853081]
>                other info that might help us debug this:
> 
> [  843.853083]  Possible unsafe locking scenario:
> 
> [  843.853087]        CPU0                    CPU1
> [  843.853090]        ----                    ----
> [  843.853091]   lock(&port->mutex);
> [  843.853095]                                lock((work_completion)(&data->clk_work));
> [  843.853098]                                lock(&port->mutex);
> [  843.853101]   lock((work_completion)(&data->clk_work));
> [  843.853105]
>                 *** DEADLOCK ***
> 
> [  843.853109] 7 locks held by systemd-sleep/1628:
> [  843.853111]  #0: ffff935e302a9488 (sb_writers#4){.+.+}-{0:0}, at: vfs_write+0x1a0/0x230
> [  843.853123]  #1: ffff935e0e56a290 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write+0xa5/0x1c0
> [  843.853134]  #2: ffff935e392a7848 (kn->active#206){.+.+}-{0:0}, at: kernfs_fop_write+0xad/0x1c0
> [  843.853144]  #3: ffffffffb487b0b0 (system_transition_mutex){+.+.}-{3:3}, at: pm_suspend.cold+0xee/0x389
> [  843.853155]  #4: ffffffffb4a42370 (acpi_scan_lock){+.+.}-{3:3}, at: acpi_s2idle_begin+0xa/0x10
> [  843.853167]  #5: ffff935e39ee29a0 (&dev->mutex){....}-{3:3}, at: __device_suspend+0xe8/0x450
> [  843.853178]  #6: ffff935e302f1420 (&port->mutex){+.+.}-{3:3}, at: uart_suspend_port+0x3f/0x1f0
> [  843.853187]
>                stack backtrace:
> [  843.853195] CPU: 3 PID: 1628 Comm: systemd-sleep Tainted: G        W   E     5.9.0-rc3+ #38
> [  843.853199] Hardware name: Acer One S1003/One S1003, BIOS V1.19 03/20/2018
> [  843.853203] Call Trace:
> [  843.853213]  dump_stack+0x8b/0xb8
> [  843.853222]  check_noncircular+0x134/0x150
> [  843.853234]  __lock_acquire+0x1238/0x2080
> [  843.853244]  ? lock_acquire+0xa7/0x3d0
> [  843.853252]  lock_acquire+0xa7/0x3d0
> [  843.853257]  ? __flush_work+0x2e6/0x460
> [  843.853265]  ? lockdep_hardirqs_on_prepare+0x116/0x190
> [  843.853270]  ? asm_sysvec_apic_timer_interrupt+0x12/0x20
> [  843.853276]  ? trace_hardirqs_on+0x1c/0xe0
> [  843.853284]  __flush_work+0x30d/0x460
> [  843.853289]  ? __flush_work+0x2e6/0x460
> [  843.853295]  ? sched_clock+0x5/0x10
> [  843.853300]  ? sched_clock_cpu+0xc/0xb0
> [  843.853311]  ? __mutex_unlock_slowpath+0x35/0x270
> [  843.853321]  ? clk_notifier_unregister+0xf8/0x120
> [  843.853328]  uart_suspend_port+0x1b9/0x1f0
> [  843.853337]  ? acpi_pm_notify_work_func+0x30/0x30
> [  843.853342]  dw8250_suspend+0x10/0x20
> [  843.853346]  dpm_run_callback+0x51/0xc0
> [  843.853353]  __device_suspend+0x1f9/0x450
> [  843.853360]  dpm_suspend+0x13d/0x1e0
> [  843.853367]  dpm_suspend_start+0x77/0x80
> [  843.853373]  suspend_devices_and_enter+0xe6/0x620
> [  843.853381]  ? rcu_read_lock_sched_held+0x3f/0x80
> [  843.853388]  pm_suspend.cold+0x333/0x389
> [  843.853394]  state_store+0x71/0xd0
> [  843.853405]  kernfs_fop_write+0xdc/0x1c0
> [  843.853413]  vfs_write+0xec/0x230
> [  843.853420]  ksys_write+0x58/0xd0
> [  843.853427]  do_syscall_64+0x33/0x40
> [  843.853432]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [  843.853441] RIP: 0033:0x7f2a99d184b7
> [  843.853450] Code: 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
> [  843.853453] RSP: 002b:00007ffedaca0ef8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> [  843.853459] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f2a99d184b7
> [  843.853462] RDX: 0000000000000004 RSI: 00007ffedaca0fe0 RDI: 0000000000000004
> [  843.853465] RBP: 00007ffedaca0fe0 R08: 000000000000c0d0 R09: 00007f2a99daf0c0
> [  843.853468] R10: 00007f2a99daefc0 R11: 0000000000000246 R12: 0000000000000004
> [  843.853471] R13: 000055f1a8c67330 R14: 0000000000000004 R15: 00007f2a99dec720
> 



[Index of Archives]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux PPP]     [Linux FS]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Linmodem]     [Device Mapper]     [Linux Kernel for ARM]

  Powered by Linux