On 18/01/2025 18:58, John Glotzer wrote:
> (Note: additional discussion has taken place under
> Re: [REGRESSION] bluetooth: mt7921: Crash on Resume From Suspend And Hibernate)
>
> Hi,
>
> I have dug further into this issue and I think I have a root cause analysis that
> makes sense (at least it does for me :) ).
> The TLDR is that the root cause is the following commit that was introduced with the 6.11 kernel.
>
> https://github.com/torvalds/linux/commit/d53ab629cff57
>
> Furthermore, the problem must be the call to usleep_range() in
> drivers/net/wireless/mediatek/mt76/mt792x_core.c as this is the only behavioral change.
>
> Notice that this commit first shows up in v6.11-rc1 and is present for all subsequent releases,
> which matches perfectly the breakage pattern seen by the user community.
>
> What, then, is the evidence for this?
>
> First of all the entire community has been unanimous in the observation that the issue
> started with the 6.11 kernel. The universal experience has been that any kernel prior
> to that had no issues, and all kernels starting with 6.11 were affected. Also no attempts
> to mitigate the issue in code by attacking the problem via the firmware download code paths have
> been fruitful.
>
> The next piece of solid data is outlined here:
>
> https://github.com/alimert-t/suspend-freeze-fix-for-mt7921e
>
> Here the lead paragraph states:
> "A suspend/resume issue occurs on systems with the MediaTek MT7921 Wi-Fi adapter when
> running on Kernel 6.11.-. After suspending, the system fails to resume / freezes and requires a hard
> reset."
>
> The mitigation for this issue has consisted of one of two approaches:
>
> - rfkill bluetooth and wifi on sleep and reverse the process on wake
> - add the parameter mt7921e.disable_aspm=y to the kernel command line
> (anecdotally I have seen reports of people doing things like turning off bluetooth
> and/or wifi before suspending or for that matter rmmod mt7921e before suspending).
>
> I personally have used both of these methods with a success rate of 100%.
>
> The way to unlock the puzzle is to examine the 6.11 code with an eye towards
> - what changed between v6.10 and v6.11?
> - what is the intersection between this changeset and the disable_aspm parameter?
>
> To cut to the chase the answer to both these questions is just the contents of
> https://github.com/torvalds/linux/commit/d53ab629cff57. I confirmed this by
> diffing v6.10 and v6.11 and then going through the diff looking for disable_aspm.
>
> The following lines were added to drivers/net/wireless/mediatek/mt76/mt7921/pci.c
>
>     if (!mt7921_disable_aspm && mt76_pci_aspm_supported(pdev))
>             dev->aspm_supported = true;
>
> The bitfield aspm_supported was added to the struct mt792x_dev in drivers/net/wireless/mediatek/mt76/mt792x.h
>
> and if this bitfield is true then the call to usleep_range is made in __mt792xe_mcu_drv_pmctrl()
> in drivers/net/wireless/mediatek/mt76/mt792x_core.c.
>
>     if (dev->aspm_supported)
>             usleep_range(2000, 3000);
>
> By setting mt7921e.disable_aspm=y on the kernel command line, this code pathway is avoided
> and no crash or lockup happens when the device is woken back up.
>
> Disclaimers:
>
> - I don't claim to know the root cause for why the call to usleep_range() leads to a crash or a
> freeze.
>
> - I don't know the details of the specific issue the code for commit d53ab629cff57 was designed
> to fix, hence I don't know the consequences of removing the call to usleep_range(). However,
> I do know that the user experience has been significantly impacted negatively by the introduction
> of d53ab629cff57 into the 6.11 kernel.
>
> Thanks for your attention,
>
> John Glotzer
>

John and Sergio, have y'all tried kernel 6.12.8 or newer? People say the suspend
problem is fixed:
https://bugzilla.kernel.org/show_bug.cgi?id=219514#c11