Hi, Thorsten here, the Linux kernel's regression tracker.

Johannes, Felix, Lorenzo, Ryder, I noticed a report about a regression
in bugzilla.kernel.org that (to my untrained eyes) appears to be a bug
in some code paths of mt76x2u that was exposed by 0d9c2beed116e6 ("wifi:
mac80211: fix monitor channel with chanctx emulation") [v6.10-rc5,
v6.9.7] from Johannes.

As many (most?) kernel developers don't keep an eye on the bug tracker,
I decided to write this mail. To quote from
https://bugzilla.kernel.org/show_bug.cgi?id=219086 :

> Michael 2024-07-23 15:38:43 UTC
>
> After a user opened this discussion:
> https://github.com/ZerBea/hcxdumptool/discussions/465
>
> Jul 21 05:40:39 rpi4b-aarch kernel: mt76x2u 2-2:1.0 wlan1: entered promiscuous mode
> Jul 21 05:40:45 rpi4b-aarch kernel: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> Jul 21 05:40:45 rpi4b-aarch kernel: Mem abort info:
> Jul 21 05:40:45 rpi4b-aarch kernel: ESR = 0x0000000096000044
> Jul 21 05:40:45 rpi4b-aarch kernel: EC = 0x25: DABT (current EL), IL = 32 bits
> Jul 21 05:40:45 rpi4b-aarch kernel: SET = 0, FnV = 0
> Jul 21 05:40:45 rpi4b-aarch kernel: EA = 0, S1PTW = 0
> Jul 21 05:40:45 rpi4b-aarch kernel: FSC = 0x04: level 0 translation fault
> Jul 21 05:40:45 rpi4b-aarch kernel: Data abort info:
> Jul 21 05:40:45 rpi4b-aarch kernel: ISV = 0, ISS = 0x00000044, ISS2 = 0x00000000
> Jul 21 05:40:45 rpi4b-aarch kernel: CM = 0, WnR = 1, TnD = 0, TagAccess = 0
> Jul 21 05:40:45 rpi4b-aarch kernel: GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> Jul 21 05:40:45 rpi4b-aarch kernel: user pgtable: 4k pages, 48-bit VAs, pgdp=0000000041300000
> Jul 21 05:40:45 rpi4b-aarch kernel: [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
> Jul 21 05:40:45 rpi4b-aarch kernel: Internal error: Oops: 0000000096000044 [#1] PREEMPT SMP
>
> I decided to run a test (AMD RYZEN & Arch Linux) on kernel 6.9.10 and 6.10 which confirmed the problem:
> Trying to inject a 802.11 packet caused my AMD systems to become unresponsive.
> I don't have a dmesg log, because my entire system crashed - need to power off!
>
> To reproduce on kernel 6.9.5 up to 6.10:
> plug in an ALFA AWUS036ACM (mt76x2u)
> set monitor mode
> set WiFi channel and inject a packet
> $ sudo hcxdumptool -i wlp5s0f4u2 --rcascan=active
> or
> sudo ./aireplay-ng --test wlp5s0f4u2
>
> Kernel 6.6.40 is not affected and the user reported that kernel 6.8.2 is not affected, too.
> That looks like a regression and git bisect identified the commit that caused the problem:
>
> commit 0d9c2beed116e623ac30810d382bd67163650f98
> Author: Johannes Berg <johannes.berg@xxxxxxxxx>
> Date: Wed Jun 12 12:23:51 2024 +0200
>
>     wifi: mac80211: fix monitor channel with chanctx emulation
>
>     After the channel context emulation, there were reports that
>     changing the monitor channel no longer works. This is because
>     those drivers don't have WANT_MONITOR_VIF, so the setting the
>     channel always exits out quickly.
>
>     Fix this by always allocating the virtual monitor sdata, and
>     simply not telling the driver about it unless it wanted to.
>     This way, we have an interface/sdata to bind the chanctx to,
>     and the emulation can work correctly.
>
>     Cc: stable@xxxxxxxxxxxxxxx
>     Fixes: 0a44dfc07074 ("wifi: mac80211: simplify non-chanctx drivers")
>     Reported-and-tested-by: Savyasaachi Vanga <savyasaachiv@xxxxxxxxx>
>     Closes: https://lore.kernel.org/r/chwoymvpzwtbmzryrlitpwmta5j6mtndocxsyqvdyikqu63lon@gfds653hkknl
>     Link: https://msgid.link/20240612122351.b12d4a109dde.I1831a44417faaab92bea1071209abbe4efbe3fba@changeid
>     Signed-off-by: Johannes Berg <johannes.berg@xxxxxxxxx>
>
>  net/mac80211/driver-ops.c | 17 +++++++++++++++++
>  net/mac80211/iface.c | 21 +++++++++------------
>  net/mac80211/util.c | 2 +-
>  3 files changed, 27 insertions(+), 13 deletions(-)
>
> Looks like the patch which should fix monitor mode breaks mt76x2u driver.
>
> BTW:
> Reasons for me to set severity to high:
> "Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000"
> and
> running a simple command from which I would not have expected that my entire system crashes.
>
> Comment 1 Michael 2024-07-23 17:17:13 UTC
>
> After some more tests, I'm not longer sure that the problem is caused by the commit mentioned. It looks like it is only a symptom.
> I tested several mt76 devices e.g. this one:
> ID 148f:761a Ralink Technology, Corp. MT7610U ("Archer T2U" 2.4G+5G WLAN Adapter
>
> Driver is mt76x0u:
> $ hcxdumptool -l
> 0 3 503eaa1a736c f49da7d6f202 * wlp48s0f4u2u4 mt76x0u NETLINK
>
> All of them are running into the same problem as mentioned above,
> while other devices are working as expected, e.g.:
> ID 2357:010c TP-Link TL-WN722N v2/v3 [Realtek RTL8188EUS]
>
> Driver is rtl8xxxu
> $ hcxdumptool -l
> 0 3 9ca2f4094fe1 c8aacc8562e3 + wlp48s0f4u2u4 rtl8xxxu NETLINK
>
> This leads me to the assumption that the "chanctx emulation" inside the mt76 series driver caused the real problem.

The reporter is CCed.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

P.S.: let me use this mail to also add the report to the list of tracked
regressions to ensure it doesn't fall through the cracks:

#regzbot introduced: 0d9c2beed116e623ac30810d382bd67163650f98
#regzbot title: net: mt76x2u: NULL pointer dereference since recent change to fix chanctx emulation for monitor mode
#regzbot from: Michael <ZeroBeat@xxxxxx>
#regzbot duplicate: https://bugzilla.kernel.org/show_bug.cgi?id=219086
#regzbot ignore-activity
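
Side note on the quoted oops: the "Mem abort info" and "Data abort info" fields are all decoded from the single ESR value 0x96000044. The following is a minimal sketch of that decoding, assuming the standard Armv8-A ESR_EL1 layout for data aborts (EC in bits 31:26, IL in bit 25, ISS in bits 24:0). The helper names `bits` and `decode_data_abort` are made up for illustration and are not kernel functions.

```python
# Decode the ESR value from the quoted oops, assuming the Armv8-A ESR_EL1
# layout for data aborts. Helper names here are illustrative only.
ESR = 0x96000044


def bits(value, high, low):
    """Return the bit field value[high:low], both ends inclusive."""
    return (value >> low) & ((1 << (high - low + 1)) - 1)


def decode_data_abort(esr):
    """Split an ESR value into the fields printed in the oops."""
    iss = bits(esr, 24, 0)
    return {
        "EC": bits(esr, 31, 26),    # exception class
        "IL": bits(esr, 25, 25),    # 1 means 32-bit instruction
        "ISS": iss,                 # instruction specific syndrome
        "ISV": bits(iss, 24, 24),
        "SET": bits(iss, 12, 11),
        "FnV": bits(iss, 10, 10),
        "EA": bits(iss, 9, 9),
        "CM": bits(iss, 8, 8),
        "S1PTW": bits(iss, 7, 7),
        "WnR": bits(iss, 6, 6),     # 1 means the faulting access was a write
        "FSC": bits(iss, 5, 0),     # fault status code
    }


fields = decode_data_abort(ESR)
for name, value in fields.items():
    print(f"{name} = {value:#x}")
```

The decoded values match the log: EC = 0x25 (data abort at the current exception level), FSC = 0x04 (level 0 translation fault), and WnR = 1, meaning the faulting access was a write to address 0.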