Re: [regression] mhi: ath11k resume fails on some devices

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Kalle,

On Thu, 16 Sept 2021 at 10:00, Kalle Valo <kvalo@xxxxxxxxxxxxxx> wrote:
>
> Hi Loic and Mani,
>
> I hate to be the bearer of bad news again :)
>
> I noticed already a while ago that commit 020d3b26c07a ("bus: mhi: Early
> MHI resume failure in non M3 state"), introduced in v5.13-rc1, broke
> ath11k resume on my NUC x86 testbox using QCA6390. Interestingly enough
> Dell XPS 13 9310 laptop (with QCA6390 as well) does not have this
> problem, I only see the problem on the NUC. I do not know what's causing
> this difference.

I suppose the NUC is current PCI-Express power during suspend while
the laptop maintains PCIe/M2 power.

>
> At the moment I'm running my tests with commit 020d3b26c07a reverted and
> everything works without problems. Is there a simple way to fix this? Or
> maybe we should just revert the commit? Commit log and kernel logs from
> a failing case below.

Do you have log of success case?

To me, the device loses power, that is why MHI resuming is failing.
Normally the device should be properly recovered/reinitialized. Before
that patch the power loss was simply not detected (or handled at
higher stack level).

Regards,
Loic


>
> Kalle
>
> commit 020d3b26c07abe274ac17f64999bbd3bf3342195
> Author:     Loic Poulain <loic.poulain@xxxxxxxxxx>
> AuthorDate: Fri Mar 5 17:14:01 2021 +0100
> Commit:     Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxx>
> CommitDate: Wed Mar 10 20:11:22 2021 +0530
>
>     bus: mhi: Early MHI resume failure in non M3 state
>
>     MHI suspend/resume are symmetric and balanced procedures. If device is
>     not in M3 state on a resume, that means something happened behind our
>     back. In this case resume is aborted and error reported, to let the
>     controller handle the situation.
>
>     This is mainly requested for system wide suspend-resume operation in
>     PCI context which may lead to power-down/reset of the controller which
>     will then lose its MHI context. In such cases, PCI driver is supposed
>     to recover and reinitialize the device.
>
>     Signed-off-by: Loic Poulain <loic.poulain@xxxxxxxxxx>
>     Reviewed-by: Bhaumik Bhatt <bbhatt@xxxxxxxxxxxxxx>
>     Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxx>
>     Link: https://lore.kernel.org/r/1614960841-20233-1-git-send-email-loic.poulain@xxxxxxxxxx
>     Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxx>
>
> [  267.182376] ACPI: PM: Waking up from system sleep state S3
> [  268.192783] ACPI: EC: interrupt unblocked
> [  268.193023] pcieport 0000:00:1c.4: Intel SPT PCH root port ACS workaround enabled
> [  268.204389] pcieport 0000:00:1d.0: Intel SPT PCH root port ACS workaround enabled
> [  268.204391] pcieport 0000:00:1c.1: Intel SPT PCH root port ACS workaround enabled
> [  268.205227] pcieport 0000:00:1c.2: Intel SPT PCH root port ACS workaround enabled
> [  269.360336] ACPI: EC: event unblocked
> [  269.367187] usb usb3: root hub lost power or was reset
> [  269.367215] usb usb4: root hub lost power or was reset
> [  269.368584] ath11k_pci 0000:06:00.0: failed to set mhi state: RESUME(6)
> [  269.455966] nvme nvme0: 8/0/0 default/read/poll queues
> [  272.289737] igb 0000:05:00.0 eth1: igb: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
> [  272.424084] ath11k_pci 0000:06:00.0: timed out while waiting for wow wakeup completion
> [  272.424091] ath11k_pci 0000:06:00.0: failed to wakeup wow during resume: -110
> [  272.424096] ath11k_pci 0000:06:00.0: failed to resume core: -110
> [  272.424101] PM: dpm_run_callback(): pci_pm_resume+0x0/0x2d0 returns -110
> [  272.424119] ath11k_pci 0000:06:00.0: PM: failed to resume async: error -110
> [  275.432003] ath11k_pci 0000:06:00.0: wmi command 16387 timeout
> [  275.432034] ath11k_pci 0000:06:00.0: failed to send WMI_PDEV_SET_PARAM cmd
> [  275.432088] ath11k_pci 0000:06:00.0: failed to enable PMF QOS: (-11
> [  275.432094] ------------[ cut here ]------------
> [  275.432114] Hardware became unavailable upon resume. This could be a software issue prior to suspend or a hardware issue.
> [  275.432144] WARNING: CPU: 3 PID: 3164 at net/mac80211/util.c:2361 ieee80211_reconfig+0x216/0x22a0 [mac80211]
> [  275.432225] Modules linked in: ath11k_pci ath11k mac80211 libarc4 cfg80211 qmi_helpers qrtr_mhi mhi qrtr ns mos7840 usbserial nvme nvme_core [last unloaded: mhi]
> [  275.432287] CPU: 3 PID: 3164 Comm: kworker/u16:20 Not tainted 5.15.0-rc1 #483
> [  275.432293] Hardware name: Intel(R) Client Systems NUC8i7HVK/NUC8i7HVB, BIOS HNKBLi70.86A.0049.2018.0801.1601 08/01/2018
> [  275.432298] Workqueue: events_unbound async_run_entry_fn
> [  275.432307] RIP: 0010:ieee80211_reconfig+0x216/0x22a0 [mac80211]
> [  275.432381] Code: c0 0f 85 4b 1f 00 00 41 c6 87 7c 08 00 00 00 4c 89 ff e8 ed 41 f1 ff 41 89 c5 85 c0 74 13 48 c7 c7 40 bc 7e c0 e8 ef 63 07 e4 <0f> 0b e9 12 ff ff ff 88 5c 24 37 49 8d 47 40 48 89 c2 48 89 44 24
> [  275.432386] RSP: 0000:ffffc90002bc7ab0 EFLAGS: 00010286
> [  275.432394] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [  275.432399] RDX: 0000000000000027 RSI: 0000000000000004 RDI: fffff52000578f48
> [  275.432403] RBP: ffff88810890169a R08: 0000000000000001 R09: ffff888234fe581b
> [  275.432408] R10: ffffed10469fcb03 R11: 0000000000000001 R12: ffff88810890169e
> [  275.432412] R13: 00000000fffffff5 R14: 0000000000000000 R15: ffff888108900e20
> [  275.432417] FS:  0000000000000000(0000) GS:ffff888234e00000(0000) knlGS:0000000000000000
> [  275.432421] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  275.432426] CR2: 0000000000000000 CR3: 0000000101148002 CR4: 00000000003706e0
> [  275.432430] Call Trace:
> [  275.432443]  ? trace_rdev_return_int+0x1a0/0x1a0 [cfg80211]
> [  275.432515]  wiphy_resume+0x190/0x370 [cfg80211]
> [  275.432574]  ? trace_device_pm_callback_start+0x123/0x1b0
> [  275.432584]  ? trace_rdev_return_int+0x1a0/0x1a0 [cfg80211]
> [  275.432642]  dpm_run_callback+0xf4/0x1b0
> [  275.432650]  ? trace_device_pm_callback_end+0x1a0/0x1a0
> [  275.432658]  ? device_links_read_unlock+0x1b/0x30
> [  275.432665]  ? dpm_wait_for_superior+0x256/0x430
> [  275.432679]  device_resume+0x3d5/0x980
> [  275.432688]  ? dpm_run_callback+0x1b0/0x1b0
> [  275.432693]  ? lockdep_hardirqs_on_prepare.part.0+0x19a/0x350
> [  275.432701]  ? ktime_get+0x214/0x2f0
> [  275.432707]  ? trace_hardirqs_on+0x1c/0x120
> [  275.432715]  ? recalibrate_cpu_khz+0x10/0x10
> [  275.432726]  ? device_resume+0x980/0x980
> [  275.432732]  async_resume+0x14/0x30
> [  275.432738]  async_run_entry_fn+0x90/0x4f0
> [  275.432750]  process_one_work+0x866/0x1460
> [  275.432768]  ? pwq_dec_nr_in_flight+0x230/0x230
> [  275.432787]  ? worker_thread+0x152/0x1010
> [  275.432798]  worker_thread+0x596/0x1010
> [  275.432818]  ? process_one_work+0x1460/0x1460
> [  275.432828]  kthread+0x322/0x3e0
> [  275.432833]  ? _raw_spin_unlock_irq+0x1f/0x30
> [  275.432838]  ? set_kthread_struct+0x100/0x100
> [  275.432848]  ret_from_fork+0x22/0x30
> [  275.432872] irq event stamp: 977
> [  275.432876] hardirqs last  enabled at (985): [<ffffffffa2462c4b>] console_trylock_spinning+0x19b/0x1f0
> [  275.432882] hardirqs last disabled at (992): [<ffffffffa2462bfa>] console_trylock_spinning+0x14a/0x1f0
> [  275.432888] softirqs last  enabled at (402): [<ffffffffc095f878>] ath11k_htc_send+0x668/0xc10 [ath11k]
> [  275.432914] softirqs last disabled at (400): [<ffffffffc095f797>] ath11k_htc_send+0x587/0xc10 [ath11k]
> [  275.432937] ---[ end trace 88fd8120acef327c ]---
> [  275.433884] ------------[ cut here ]------------
> [  275.433888] wlan0: Failed check-sdata-in-driver check, flags: 0x4
> [  275.433917] WARNING: CPU: 3 PID: 3164 at net/mac80211/driver-ops.c:97 drv_remove_interface+0x2cb/0x330 [mac80211]
> [  275.434008] Modules linked in: ath11k_pci ath11k mac80211 libarc4 cfg80211 qmi_helpers qrtr_mhi mhi qrtr ns mos7840 usbserial nvme nvme_core [last unloaded: mhi]
> [  275.434068] CPU: 3 PID: 3164 Comm: kworker/u16:20 Tainted: G        W         5.15.0-rc1 #483
> [  275.434074] Hardware name: Intel(R) Client Systems NUC8i7HVK/NUC8i7HVB, BIOS HNKBLi70.86A.0049.2018.0801.1601 08/01/2018
> [  275.434079] Workqueue: events_unbound async_run_entry_fn
> [  275.434087] RIP: 0010:drv_remove_interface+0x2cb/0x330 [mac80211]
> [  275.434154] Code: c1 e9 03 80 3c 01 00 75 72 48 8b 83 88 06 00 00 48 8d b3 a8 06 00 00 48 c7 c7 60 2a 7e c0 48 85 c0 48 0f 45 f0 e8 8a 12 16 e4 <0f> 0b eb 90 e8 6c a8 23 e2 e9 e9 fd ff ff e8 62 a8 23 e2 e9 06 fe
> [  275.434159] RSP: 0000:ffffc90002bc7788 EFLAGS: 00010282
> [  275.434167] RAX: 0000000000000000 RBX: ffff8881735a0c40 RCX: 0000000000000000
> [  275.434171] RDX: 0000000000000027 RSI: 0000000000000004 RDI: fffff52000578ee3
> [  275.434176] RBP: ffff888108900e20 R08: 0000000000000001 R09: ffff888234fe581b
> [  275.434180] R10: ffffed10469fcb03 R11: 0000000000000001 R12: dffffc0000000000
> [  275.434184] R13: ffff8881735a12d8 R14: ffff888108901568 R15: 000000000000000f
> [  275.434189] FS:  0000000000000000(0000) GS:ffff888234e00000(0000) knlGS:0000000000000000
> [  275.434194] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  275.434198] CR2: 0000000000000000 CR3: 0000000101148002 CR4: 00000000003706e0
> [  275.434203] Call Trace:
> [  275.434212]  ieee80211_do_stop+0xe27/0x1a20 [mac80211]
> [  275.434291]  ? mutex_lock_io_nested+0x1490/0x1490
> [  275.434303]  ? ieee80211_del_virtual_monitor+0x1a0/0x1a0 [mac80211]
> [  275.434370]  ? mark_held_locks+0xa5/0xe0
> [  275.434382]  ? lockdep_hardirqs_on_prepare.part.0+0x19a/0x350
> [  275.434389]  ? __local_bh_enable_ip+0x9d/0xf0
> [  275.434394]  ? trace_hardirqs_on+0x1c/0x120
> [  275.434410]  ieee80211_stop+0xb2/0x230 [mac80211]
> [  275.434484]  __dev_close_many+0x191/0x2a0
> [  275.434491]  ? netif_tx_stop_all_queues+0xf0/0xf0
> [  275.434496]  ? find_held_lock+0x33/0x110
> [  275.434507]  ? __lock_release+0x494/0xa40
> [  275.434518]  dev_close_many+0x1c5/0x540
> [  275.434527]  ? wait_for_completion_io+0x280/0x280
> [  275.434535]  ? dev_get_by_napi_id+0x110/0x110
> [  275.434544]  ? wiphy_resume+0x1a5/0x370 [cfg80211]
> [  275.434610]  dev_close+0x132/0x1d0
> [  275.434617]  ? dev_xdp_attach.constprop.0+0x750/0x750
> [  275.434633]  cfg80211_shutdown_all_interfaces+0x71/0x180 [cfg80211]
> [  275.434697]  wiphy_resume+0x1b2/0x370 [cfg80211]
> [  275.434755]  ? trace_device_pm_callback_start+0x123/0x1b0
> [  275.434765]  ? trace_rdev_return_int+0x1a0/0x1a0 [cfg80211]
> [  275.434822]  dpm_run_callback+0xf4/0x1b0
> [  275.434830]  ? trace_device_pm_callback_end+0x1a0/0x1a0
> [  275.434839]  ? device_links_read_unlock+0x1b/0x30
> [  275.434845]  ? dpm_wait_for_superior+0x256/0x430
> [  275.434859]  device_resume+0x3d5/0x980
> [  275.434868]  ? dpm_run_callback+0x1b0/0x1b0
> [  275.434873]  ? lockdep_hardirqs_on_prepare.part.0+0x19a/0x350
> [  275.434880]  ? ktime_get+0x214/0x2f0
> [  275.434886]  ? trace_hardirqs_on+0x1c/0x120
> [  275.434893]  ? recalibrate_cpu_khz+0x10/0x10
> [  275.434904]  ? device_resume+0x980/0x980
> [  275.434910]  async_resume+0x14/0x30
> [  275.434916]  async_run_entry_fn+0x90/0x4f0
> [  275.434928]  process_one_work+0x866/0x1460
> [  275.434946]  ? pwq_dec_nr_in_flight+0x230/0x230
> [  275.434965]  ? worker_thread+0x152/0x1010
> [  275.434992]  worker_thread+0x596/0x1010
> [  275.435013]  ? process_one_work+0x1460/0x1460
> [  275.435022]  kthread+0x322/0x3e0
> [  275.435027]  ? _raw_spin_unlock_irq+0x1f/0x30
> [  275.435032]  ? set_kthread_struct+0x100/0x100
> [  275.435042]  ret_from_fork+0x22/0x30
> [  275.435065] irq event stamp: 1923
> [  275.435069] hardirqs last  enabled at (1931): [<ffffffffa2462c4b>] console_trylock_spinning+0x19b/0x1f0
> [  275.435076] hardirqs last disabled at (1938): [<ffffffffa2462bfa>] console_trylock_spinning+0x14a/0x1f0
> [  275.435082] softirqs last  enabled at (1290): [<ffffffffa4c0050a>] __do_softirq+0x50a/0x7d5
> [  275.435087] softirqs last disabled at (1283): [<ffffffffa2336935>] __irq_exit_rcu+0xe5/0x120
> [  275.435093] ---[ end trace 88fd8120acef327d ]---
> [  275.435126] ------------[ cut here ]------------
> [  275.435130] WARNING: CPU: 3 PID: 3164 at net/mac80211/driver-ops.c:36 drv_stop+0x290/0x310 [mac80211]
> [  275.435197] Modules linked in: ath11k_pci ath11k mac80211 libarc4 cfg80211 qmi_helpers qrtr_mhi mhi qrtr ns mos7840 usbserial nvme nvme_core [last unloaded: mhi]
> [  275.435256] CPU: 3 PID: 3164 Comm: kworker/u16:20 Tainted: G        W         5.15.0-rc1 #483
> [  275.435261] Hardware name: Intel(R) Client Systems NUC8i7HVK/NUC8i7HVB, BIOS HNKBLi70.86A.0049.2018.0801.1601 08/01/2018
> [  275.435265] Workqueue: events_unbound async_run_entry_fn
> [  275.435274] RIP: 0010:drv_stop+0x290/0x310 [mac80211]
> [  275.435339] Code: 80 3d 5f f1 29 00 00 75 e2 48 c7 c2 c0 29 7e c0 be 34 01 00 00 48 c7 c7 20 2a 7e c0 c6 05 43 f1 29 00 01 e8 af 64 16 e4 eb c1 <0f> 0b 5b 5d 41 5c 41 5d c3 0f 0b e9 d3 fd ff ff 48 89 ef e8 18 b2
> [  275.435344] RSP: 0000:ffffc90002bc7790 EFLAGS: 00010246
> [  275.435352] RAX: 0000000000000000 RBX: ffff888108900e20 RCX: 0000000000000001
> [  275.435356] RDX: 0000000000000004 RSI: ffffffffa5a021a0 RDI: ffff888145778920
> [  275.435360] RBP: ffff88810890169c R08: 0000000000000001 R09: ffffc90002bc757f
> [  275.435365] R10: ffffc90002bc77a8 R11: 0000000000000001 R12: dffffc0000000000
> [  275.435369] R13: ffff888108900e20 R14: ffff888108901568 R15: 000000000000000f
> [  275.435373] FS:  0000000000000000(0000) GS:ffff888234e00000(0000) knlGS:0000000000000000
> [  275.435378] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  275.435382] CR2: 0000000000000000 CR3: 0000000101148002 CR4: 00000000003706e0
> [  275.435387] Call Trace:
> [  275.435394]  ieee80211_do_stop+0x11dd/0x1a20 [mac80211]
> [  275.435472]  ? mutex_lock_io_nested+0x1490/0x1490
> [  275.435484]  ? ieee80211_del_virtual_monitor+0x1a0/0x1a0 [mac80211]
> [  275.435551]  ? mark_held_locks+0xa5/0xe0
> [  275.435562]  ? lockdep_hardirqs_on_prepare.part.0+0x19a/0x350
> [  275.435569]  ? __local_bh_enable_ip+0x9d/0xf0
> [  275.435574]  ? trace_hardirqs_on+0x1c/0x120
> [  275.435590]  ieee80211_stop+0xb2/0x230 [mac80211]
> [  275.435663]  __dev_close_many+0x191/0x2a0
> [  275.435670]  ? netif_tx_stop_all_queues+0xf0/0xf0
> [  275.435675]  ? find_held_lock+0x33/0x110
> [  275.435686]  ? __lock_release+0x494/0xa40
> [  275.435697]  dev_close_many+0x1c5/0x540
> [  275.435706]  ? wait_for_completion_io+0x280/0x280
> [  275.435713]  ? dev_get_by_napi_id+0x110/0x110
> [  275.435723]  ? wiphy_resume+0x1a5/0x370 [cfg80211]
> [  275.435790]  dev_close+0x132/0x1d0
> [  275.435797]  ? dev_xdp_attach.constprop.0+0x750/0x750
> [  275.435813]  cfg80211_shutdown_all_interfaces+0x71/0x180 [cfg80211]
> [  275.435876]  wiphy_resume+0x1b2/0x370 [cfg80211]
> [  275.435935]  ? trace_device_pm_callback_start+0x123/0x1b0
> [  275.435944]  ? trace_rdev_return_int+0x1a0/0x1a0 [cfg80211]
> [  275.436018]  dpm_run_callback+0xf4/0x1b0
> [  275.436026]  ? trace_device_pm_callback_end+0x1a0/0x1a0
> [  275.436035]  ? device_links_read_unlock+0x1b/0x30
> [  275.436041]  ? dpm_wait_for_superior+0x256/0x430
> [  275.436055]  device_resume+0x3d5/0x980
> [  275.436064]  ? dpm_run_callback+0x1b0/0x1b0
> [  275.436069]  ? lockdep_hardirqs_on_prepare.part.0+0x19a/0x350
> [  275.436076]  ? ktime_get+0x214/0x2f0
> [  275.436082]  ? trace_hardirqs_on+0x1c/0x120
> [  275.436089]  ? recalibrate_cpu_khz+0x10/0x10
> [  275.436100]  ? device_resume+0x980/0x980
> [  275.436106]  async_resume+0x14/0x30
> [  275.436112]  async_run_entry_fn+0x90/0x4f0
> [  275.436124]  process_one_work+0x866/0x1460
> [  275.436142]  ? pwq_dec_nr_in_flight+0x230/0x230
> [  275.436161]  ? worker_thread+0x152/0x1010
> [  275.436172]  worker_thread+0x596/0x1010
> [  275.436191]  ? process_one_work+0x1460/0x1460
> [  275.436201]  kthread+0x322/0x3e0
> [  275.436206]  ? _raw_spin_unlock_irq+0x1f/0x30
> [  275.436211]  ? set_kthread_struct+0x100/0x100
> [  275.436221]  ret_from_fork+0x22/0x30
> [  275.436244] irq event stamp: 2619
> [  275.436248] hardirqs last  enabled at (2627): [<ffffffffa2462c4b>] console_trylock_spinning+0x19b/0x1f0
> [  275.436254] hardirqs last disabled at (2634): [<ffffffffa2462bfa>] console_trylock_spinning+0x14a/0x1f0
> [  275.436260] softirqs last  enabled at (1290): [<ffffffffa4c0050a>] __do_softirq+0x50a/0x7d5
> [  275.436266] softirqs last disabled at (1283): [<ffffffffa2336935>] __irq_exit_rcu+0xe5/0x120
> [  275.436271] ---[ end trace 88fd8120acef327e ]---
> [  275.438124] PM: dpm_run_callback(): wiphy_resume+0x0/0x370 [cfg80211] returns -11
> [  275.438194] ieee80211 phy0: PM: failed to resume async: error -11
>
> --
> https://patchwork.kernel.org/project/linux-wireless/list/
>
> https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [Linux for Sparc]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux