Dear Angelo, Dear Maintainers, thank you for your help supporting MT8192 platform. Unfortunataly, the issue with Asurada Spherion discussed in this thread persists with v6.8-rc4. I'm including an updated dmesg snippet below. Please let me know if I can provide any further information to help debug this issue. Thank you Leonard [ 688.910709] PM: suspend entry (deep) [ 688.978833] Filesystems sync: 0.064 seconds [ 688.987422] Freezing user space processes [ 688.999288] Freezing user space processes completed (elapsed 0.007 seconds) [ 689.006391] OOM killer disabled. [ 689.009632] Freezing remaining freezable tasks [ 689.016980] Freezing remaining freezable tasks completed (elapsed 0.002 seconds) [ 689.024376] printk: Suspending console(s) (use no_console_suspend to debug) [ 689.181542] queueing ieee80211 work while going to suspend [ 689.301195] ======================================================== [ 689.301197] WARNING: possible irq lock inversion dependency detected [ 689.301199] 6.8.0-rc4-cos-mt9 #1 Not tainted [ 689.301203] -------------------------------------------------------- [ 689.301204] systemd-sleep/4626 just changed the state of lock: [ 689.301208] ffff610985759560 (&pcie->irq_lock){+...}-{2:2}, at: mtk_pcie_suspend_noirq+0xd0/0x14c [ 689.301228] but this lock was taken by another, HARDIRQ-safe lock in the past: [ 689.301230] (&irq_desc_lock_class){-.-.}-{2:2} [ 689.301234] and interrupts could create inverse lock ordering between them. [ 689.301236] other info that might help us debug this: [ 689.301238] Possible interrupt unsafe locking scenario: [ 689.301240] CPU0 CPU1 [ 689.301241] ---- ---- [ 689.301242] lock(&pcie->irq_lock); [ 689.301246] local_irq_disable(); [ 689.301247] lock(&irq_desc_lock_class); [ 689.301251] lock(&pcie->irq_lock); [ 689.301254] <Interrupt> [ 689.301255] lock(&irq_desc_lock_class); [ 689.301258] *** DEADLOCK *** [ 689.301259] 4 locks held by systemd-sleep/4626: [ 689.301262] #0: ffff610987c133f8 (sb_writers#6){.+.+}-{0:0}, at: vfs_write+0x2ac/0x340 [ 689.301278] #1: ffff610a2d9a4688 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0xf0/0x1b0 [ 689.301289] #2: ffff610980b74158 (kn->active#115){.+.+}-{0:0}, at: kernfs_fop_write_iter+0xf8/0x1b0 [ 689.301300] #3: ffffd2e4d5792cc0 (system_transition_mutex){+.+.}-{3:3}, at: pm_suspend+0x98/0x330 [ 689.301313] the shortest dependencies between 2nd lock and 1st lock: [ 689.301315] -> (&irq_desc_lock_class){-.-.}-{2:2} { [ 689.301321] IN-HARDIRQ-W at: [ 689.301324] lock_acquire+0x11c/0x324 [ 689.301329] _raw_spin_lock+0x48/0x60 [ 689.301337] handle_fasteoi_irq+0x2c/0x23c [ 689.301342] generic_handle_domain_irq+0x2c/0x44 [ 689.301348] gic_handle_irq+0x16c/0x29c [ 689.301352] call_on_irq_stack+0x24/0x4c [ 689.301357] do_interrupt_handler+0x80/0x84 [ 689.301362] el1_interrupt+0x44/0x98 [ 689.301368] el1h_64_irq_handler+0x18/0x24 [ 689.301374] el1h_64_irq+0x78/0x7c [ 689.301377] default_idle_call+0xa0/0x14c [ 689.301381] do_idle+0x270/0x2d0 [ 689.301387] cpu_startup_entry+0x38/0x3c [ 689.301392] rest_init+0x118/0x1a8 [ 689.301395] arch_post_acpi_subsys_init+0x0/0x8 [ 689.301403] start_kernel+0x5b8/0x6ac [ 689.301407] __primary_switched+0xb8/0xc0 [ 689.301411] IN-SOFTIRQ-W at: [ 689.301414] lock_acquire+0x11c/0x324 [ 689.301418] _raw_spin_lock+0x48/0x60 [ 689.301423] handle_fasteoi_irq+0x2c/0x23c [ 689.301427] generic_handle_domain_irq+0x2c/0x44 [ 689.301432] gic_handle_irq+0x16c/0x29c [ 689.301435] do_interrupt_handler+0x50/0x84 [ 689.301439] el1_interrupt+0x44/0x98 [ 689.301445] el1h_64_irq_handler+0x18/0x24 [ 689.301451] el1h_64_irq+0x78/0x7c [ 689.301454] _nohz_idle_balance.isra.0+0xd4/0x388 [ 689.301458] run_rebalance_domains+0x64/0x74 [ 689.301462] __do_softirq+0x148/0x518 [ 689.301465] ____do_softirq+0x10/0x1c [ 689.301469] call_on_irq_stack+0x24/0x4c [ 689.301473] do_softirq_own_stack+0x1c/0x2c [ 689.301477] __irq_exit_rcu+0x12c/0x148 [ 689.301484] irq_exit_rcu+0x10/0x38 [ 689.301489] el1_interrupt+0x48/0x98 [ 689.301495] el1h_64_irq_handler+0x18/0x24 [ 689.301500] el1h_64_irq+0x78/0x7c [ 689.301503] default_idle_call+0xa0/0x14c [ 689.301507] do_idle+0x270/0x2d0 [ 689.301511] cpu_startup_entry+0x38/0x3c [ 689.301517] rest_init+0x118/0x1a8 [ 689.301520] arch_post_acpi_subsys_init+0x0/0x8 [ 689.301525] start_kernel+0x5b8/0x6ac [ 689.301529] __primary_switched+0xb8/0xc0 [ 689.301533] INITIAL USE at: [ 689.301536] lock_acquire+0x11c/0x324 [ 689.301540] _raw_spin_lock_irqsave+0x68/0xbc [ 689.301545] __irq_get_desc_lock+0x58/0x98 [ 689.301551] irq_modify_status+0x38/0x140 [ 689.301555] irq_set_percpu_devid+0x70/0x94 [ 689.301561] gic_irq_domain_alloc+0x208/0x248 [ 689.301568] irq_domain_alloc_irqs_locked+0xf8/0x358 [ 689.301573] __irq_domain_alloc_irqs+0x6c/0xc0 [ 689.301578] gic_init_bases+0x358/0x6a4 [ 689.301585] gic_of_init+0x290/0x31c [ 689.301589] of_irq_init+0x304/0x390 [ 689.301596] irqchip_init+0x18/0x24 [ 689.301601] init_IRQ+0xb4/0x12c [ 689.301606] start_kernel+0x26c/0x6ac [ 689.301610] __primary_switched+0xb8/0xc0 [ 689.301614] } [ 689.301615] ... key at: [<ffffd2e4d65b8bc0>] irq_desc_lock_class+0x0/0x10 [ 689.301621] ... acquired at: [ 689.301623] _raw_spin_lock_irqsave+0x68/0xbc [ 689.301629] mtk_msi_bottom_irq_unmask+0x30/0x70 [ 689.301633] irq_chip_unmask_parent+0x1c/0x28 [ 689.301637] mtk_pcie_msi_irq_unmask+0x20/0x30 [ 689.301642] irq_enable+0x40/0x8c [ 689.301645] __irq_startup+0x78/0xa4 [ 689.301649] irq_startup+0xfc/0x164 [ 689.301653] __setup_irq+0x3f0/0x6bc [ 689.301656] request_threaded_irq+0xec/0x1a4 [ 689.301659] pcie_pme_probe+0xf4/0x1bc [ 689.301666] pcie_port_probe_service+0x38/0x64 [ 689.301670] really_probe+0x148/0x2ac [ 689.301676] __driver_probe_device+0x78/0x12c [ 689.301680] driver_probe_device+0x3c/0x160 [ 689.301683] __device_attach_driver+0xb8/0x138 [ 689.301687] bus_for_each_drv+0x80/0xdc [ 689.301691] __device_attach+0x9c/0x188 [ 689.301695] device_initial_probe+0x14/0x20 [ 689.301699] bus_probe_device+0xac/0xb0 [ 689.301702] device_add+0x5ac/0x768 [ 689.301708] device_register+0x20/0x30 [ 689.301714] pcie_portdrv_probe+0x340/0x5c4 [ 689.301718] local_pci_probe+0x40/0xa4 [ 689.301722] pci_device_probe+0xac/0x1f0 [ 689.301727] really_probe+0x148/0x2ac [ 689.301730] __driver_probe_device+0x78/0x12c [ 689.301734] driver_probe_device+0x3c/0x160 [ 689.301738] __device_attach_driver+0xb8/0x138 [ 689.301742] bus_for_each_drv+0x80/0xdc [ 689.301745] __device_attach+0x9c/0x188 [ 689.301748] device_attach+0x14/0x20 [ 689.301752] pci_bus_add_device+0x64/0xd4 [ 689.301756] pci_bus_add_devices+0x3c/0x88 [ 689.301760] pci_host_probe+0x44/0xbc [ 689.301764] mtk_pcie_probe+0x288/0x47c [ 689.301768] platform_probe+0x68/0xc0 [ 689.301773] really_probe+0x148/0x2ac [ 689.301777] __driver_probe_device+0x78/0x12c [ 689.301781] driver_probe_device+0x3c/0x160 [ 689.301784] __device_attach_driver+0xb8/0x138 [ 689.301788] bus_for_each_drv+0x80/0xdc [ 689.301791] __device_attach+0x9c/0x188 [ 689.301795] device_initial_probe+0x14/0x20 [ 689.301799] bus_probe_device+0xac/0xb0 [ 689.301802] deferred_probe_work_func+0x8c/0xc8 [ 689.301806] process_one_work+0x1ec/0x4dc [ 689.301812] worker_thread+0x1dc/0x3d4 [ 689.301817] kthread+0xfc/0x100 [ 689.301821] ret_from_fork+0x10/0x20 [ 689.301826] -> (&pcie->irq_lock){+...}-{2:2} { [ 689.301832] HARDIRQ-ON-W at: [ 689.301835] lock_acquire+0x11c/0x324 [ 689.301839] _raw_spin_lock+0x48/0x60 [ 689.301844] mtk_pcie_suspend_noirq+0xd0/0x14c [ 689.301849] dpm_run_callback+0x34/0x9c [ 689.301854] __device_suspend_noirq+0x68/0x1f0 [ 689.301859] dpm_noirq_suspend_devices+0x1d8/0x274 [ 689.301864] dpm_suspend_noirq+0x24/0x98 [ 689.301868] suspend_devices_and_enter+0x3b0/0x65c [ 689.301874] pm_suspend+0x1fc/0x330 [ 689.301880] state_store+0x80/0xec [ 689.301885] kobj_attr_store+0x18/0x2c [ 689.301889] sysfs_kf_write+0x4c/0x78 [ 689.301893] kernfs_fop_write_iter+0x120/0x1b0 [ 689.301896] vfs_write+0x224/0x340 [ 689.301902] ksys_write+0x6c/0x100 [ 689.301908] __arm64_sys_write+0x1c/0x28 [ 689.301914] invoke_syscall+0x48/0x114 [ 689.301920] el0_svc_common.constprop.0+0x40/0xe0 [ 689.301926] do_el0_svc+0x1c/0x28 [ 689.301932] el0_svc+0x50/0x108 [ 689.301937] el0t_64_sync_handler+0x100/0x12c [ 689.301943] el0t_64_sync+0x1a4/0x1a8 [ 689.301946] INITIAL USE at: [ 689.301948] lock_acquire+0x11c/0x324 [ 689.301952] _raw_spin_lock_irqsave+0x68/0xbc [ 689.301957] mtk_msi_bottom_irq_unmask+0x30/0x70 [ 689.301962] irq_chip_unmask_parent+0x1c/0x28 [ 689.301966] mtk_pcie_msi_irq_unmask+0x20/0x30 [ 689.301970] irq_enable+0x40/0x8c [ 689.301974] __irq_startup+0x78/0xa4 [ 689.301978] irq_startup+0xfc/0x164 [ 689.301982] __setup_irq+0x3f0/0x6bc [ 689.301984] request_threaded_irq+0xec/0x1a4 [ 689.301987] pcie_pme_probe+0xf4/0x1bc [ 689.301993] pcie_port_probe_service+0x38/0x64 [ 689.301997] really_probe+0x148/0x2ac [ 689.302001] __driver_probe_device+0x78/0x12c [ 689.302004] driver_probe_device+0x3c/0x160 [ 689.302008] __device_attach_driver+0xb8/0x138 [ 689.302012] bus_for_each_drv+0x80/0xdc [ 689.302015] __device_attach+0x9c/0x188 [ 689.302018] device_initial_probe+0x14/0x20 [ 689.302022] bus_probe_device+0xac/0xb0 [ 689.302026] device_add+0x5ac/0x768 [ 689.302031] device_register+0x20/0x30 [ 689.302037] pcie_portdrv_probe+0x340/0x5c4 [ 689.302041] local_pci_probe+0x40/0xa4 [ 689.302044] pci_device_probe+0xac/0x1f0 [ 689.302048] really_probe+0x148/0x2ac [ 689.302052] __driver_probe_device+0x78/0x12c [ 689.302056] driver_probe_device+0x3c/0x160 [ 689.302059] __device_attach_driver+0xb8/0x138 [ 689.302063] bus_for_each_drv+0x80/0xdc [ 689.302066] __device_attach+0x9c/0x188 [ 689.302070] device_attach+0x14/0x20 [ 689.302074] pci_bus_add_device+0x64/0xd4 [ 689.302077] pci_bus_add_devices+0x3c/0x88 [ 689.302080] pci_host_probe+0x44/0xbc [ 689.302083] mtk_pcie_probe+0x288/0x47c [ 689.302088] platform_probe+0x68/0xc0 [ 689.302093] really_probe+0x148/0x2ac [ 689.302096] __driver_probe_device+0x78/0x12c [ 689.302100] driver_probe_device+0x3c/0x160 [ 689.302104] __device_attach_driver+0xb8/0x138 [ 689.302108] bus_for_each_drv+0x80/0xdc [ 689.302111] __device_attach+0x9c/0x188 [ 689.302115] device_initial_probe+0x14/0x20 [ 689.302119] bus_probe_device+0xac/0xb0 [ 689.302122] deferred_probe_work_func+0x8c/0xc8 [ 689.302125] process_one_work+0x1ec/0x4dc [ 689.302130] worker_thread+0x1dc/0x3d4 [ 689.302135] kthread+0xfc/0x100 [ 689.302139] ret_from_fork+0x10/0x20 [ 689.302143] } [ 689.302144] ... key at: [<ffffd2e4d65f8690>] __key.1+0x0/0x10 [ 689.302150] ... acquired at: [ 689.302151] __lock_acquire+0x374/0x1a10 [ 689.302155] lock_acquire+0x11c/0x324 [ 689.302159] _raw_spin_lock+0x48/0x60 [ 689.302163] mtk_pcie_suspend_noirq+0xd0/0x14c [ 689.302168] dpm_run_callback+0x34/0x9c [ 689.302172] __device_suspend_noirq+0x68/0x1f0 [ 689.302177] dpm_noirq_suspend_devices+0x1d8/0x274 [ 689.302181] dpm_suspend_noirq+0x24/0x98 [ 689.302186] suspend_devices_and_enter+0x3b0/0x65c [ 689.302191] pm_suspend+0x1fc/0x330 [ 689.302197] state_store+0x80/0xec [ 689.302202] kobj_attr_store+0x18/0x2c [ 689.302205] sysfs_kf_write+0x4c/0x78 [ 689.302209] kernfs_fop_write_iter+0x120/0x1b0 [ 689.302213] vfs_write+0x224/0x340 [ 689.302218] ksys_write+0x6c/0x100 [ 689.302224] __arm64_sys_write+0x1c/0x28 [ 689.302230] invoke_syscall+0x48/0x114 [ 689.302235] el0_svc_common.constprop.0+0x40/0xe0 [ 689.302241] do_el0_svc+0x1c/0x28 [ 689.302246] el0_svc+0x50/0x108 [ 689.302252] el0t_64_sync_handler+0x100/0x12c [ 689.302257] el0t_64_sync+0x1a4/0x1a8 [ 689.302261] stack backtrace: [ 689.302264] CPU: 4 PID: 4626 Comm: systemd-sleep Not tainted 6.8.0-rc4-cos-mt9 #1 [ 689.302268] Hardware name: Google Spherion (rev0 - 3) (DT) [ 689.302271] Call trace: [ 689.302272] dump_backtrace+0x98/0xf0 [ 689.302276] show_stack+0x18/0x24 [ 689.302279] dump_stack_lvl+0xb8/0x118 [ 689.302285] dump_stack+0x18/0x24 [ 689.302290] print_irq_inversion_bug.part.0+0x1ec/0x27c [ 689.302294] mark_lock+0x300/0x720 [ 689.302298] __lock_acquire+0x374/0x1a10 [ 689.302301] lock_acquire+0x11c/0x324 [ 689.302305] _raw_spin_lock+0x48/0x60 [ 689.302310] mtk_pcie_suspend_noirq+0xd0/0x14c [ 689.302315] dpm_run_callback+0x34/0x9c [ 689.302319] __device_suspend_noirq+0x68/0x1f0 [ 689.302324] dpm_noirq_suspend_devices+0x1d8/0x274 [ 689.302328] dpm_suspend_noirq+0x24/0x98 [ 689.302332] suspend_devices_and_enter+0x3b0/0x65c [ 689.302338] pm_suspend+0x1fc/0x330 [ 689.302343] state_store+0x80/0xec [ 689.302348] kobj_attr_store+0x18/0x2c [ 689.302352] sysfs_kf_write+0x4c/0x78 [ 689.302355] kernfs_fop_write_iter+0x120/0x1b0 [ 689.302359] vfs_write+0x224/0x340 [ 689.302364] ksys_write+0x6c/0x100 [ 689.302370] __arm64_sys_write+0x1c/0x28 [ 689.302376] invoke_syscall+0x48/0x114 [ 689.302381] el0_svc_common.constprop.0+0x40/0xe0 [ 689.302387] do_el0_svc+0x1c/0x28 [ 689.302392] el0_svc+0x50/0x108 [ 689.302398] el0t_64_sync_handler+0x100/0x12c [ 689.302403] el0t_64_sync+0x1a4/0x1a8 [ 689.305522] Disabling non-boot CPUs ... [ 689.309813] IRQ273: set affinity failed(-22). [ 689.309868] IRQ288: set affinity failed(-22). [ 689.309993] psci: CPU1 killed (polled 0 ms) [ 689.314674] IRQ273: set affinity failed(-22). [ 689.314725] IRQ288: set affinity failed(-22). [ 689.314832] psci: CPU2 killed (polled 0 ms) [ 689.318738] IRQ273: set affinity failed(-22). [ 689.318781] IRQ288: set affinity failed(-22). [ 689.318904] psci: CPU3 killed (polled 0 ms) [ 689.320721] IRQ273: set affinity failed(-22). [ 689.320733] IRQ288: set affinity failed(-22). [ 689.320809] psci: CPU4 killed (polled 0 ms) [ 689.322718] IRQ273: set affinity failed(-22). [ 689.322738] IRQ288: set affinity failed(-22). [ 689.322852] psci: CPU5 killed (polled 0 ms) [ 689.326646] psci: CPU6 killed (polled 0 ms) [ 689.328620] psci: CPU7 killed (polled 0 ms) [ 689.331564] Enabling non-boot CPUs ... [ 689.332951] Detected VIPT I-cache on CPU1 [ 689.333012] GICv3: CPU1: found redistributor 100 region 0:0x000000000c060000 [ 689.333079] CPU1: Booted secondary processor 0x0000000100 [0x412fd050] [ 689.335992] CPU1 is up [ 689.336956] Detected VIPT I-cache on CPU2 [ 689.337006] GICv3: CPU2: found redistributor 200 region 0:0x000000000c080000 [ 689.337063] CPU2: Booted secondary processor 0x0000000200 [0x412fd050] [ 689.339991] CPU2 is up [ 689.341081] Detected VIPT I-cache on CPU3 [ 689.341127] GICv3: CPU3: found redistributor 300 region 0:0x000000000c0a0000 [ 689.341179] CPU3: Booted secondary processor 0x0000000300 [0x412fd050] [ 689.344328] CPU3 is up [ 689.345125] Detected PIPT I-cache on CPU4 [ 689.345146] GICv3: CPU4: found redistributor 400 region 0:0x000000000c0c0000 [ 689.345167] CPU4: Booted secondary processor 0x0000000400 [0x414fd0b0] [ 689.346438] CPU4 is up [ 689.347212] Detected PIPT I-cache on CPU5 [ 689.347231] GICv3: CPU5: found redistributor 500 region 0:0x000000000c0e0000 [ 689.347251] CPU5: Booted secondary processor 0x0000000500 [0x414fd0b0] [ 689.348409] CPU5 is up [ 689.349783] Detected PIPT I-cache on CPU6 [ 689.349803] GICv3: CPU6: found redistributor 600 region 0:0x000000000c100000 [ 689.349822] CPU6: Booted secondary processor 0x0000000600 [0x414fd0b0] [ 689.351051] CPU6 is up [ 689.351810] Detected PIPT I-cache on CPU7 [ 689.351830] GICv3: CPU7: found redistributor 700 region 0:0x000000000c120000 [ 689.351849] CPU7: Booted secondary processor 0x0000000700 [0x414fd0b0] [ 689.353144] CPU7 is up [ 691.492453] OOM killer enabled. [ 691.495588] Restarting tasks ... done. [ 691.501402] random: crng reseeded on system resumption [ 691.504932] PM: suspend exit September 20, 2023 at 4:22 AM, "AngeloGioacchino Del Regno" <angelogioacchino.delregno@xxxxxxxxxxxxx> wrote: > Il 20/09/23 03:28, Leonard Lausen ha scritto: > > > > > Dear AngeloGioacchino, Dear Maintainers, > > > > on MT8192 Asurada Spherion (Acer 514), I observe the following sleep hang, causing > a failure to suspend the system. The hang looks related to the deadlock you fixed > for MT8195 Tomato Chromebook in the past, thus I'm including you in the To: line. > The sleep hang happens with both v6.5.4 as well as v6.5.4 with > tags/mediatek-drm-next-6.6 merged in. (I'm unable to validate v6.6-rc2 currently, > as there's a regression breaking boot.) Please let me know if I can provide any > additional information to help debug this issue or if you have any ideas how to > address this bug. > > > > Looks like the issue is not related to the PCIe driver, but to something else not > > going to sleep (and the ChromeOS EC detecting that, and preventing the kernel to > > believe that the platform actually went to sleep). > > I'll try to check what's going on with MT8192 sleep ASAP. > > Anyway, as for the v6.6-rc2 regression, that's happening on MT8195 as well.. and > > I have solved it with this commit: > > https://lore.kernel.org/lkml/20230918150043.403250-1-angelogioacchino.delregno@xxxxxxxxxxxxx/ > > Regards, > > Angelo