Am Mittwoch, dem 17.07.2024 um 17:25 +0200 schrieb Felix Fietkau: > On 17.07.24 16:38, Bert Karwatzki wrote: > > Am Freitag, dem 12.07.2024 um 13:06 +0200 schrieb Bert Karwatzki: > > > Am Donnerstag, dem 11.07.2024 um 18:40 -0500 schrieb Sean Wang: > > > > Hi Bert, > > > > > > > > Thanks for the detailed debug log. I've quickly made a change to fix > > > > the issue. Right now, I can't access the test environment, but I'll > > > > test it and send it out as soon as possible. Here's the patch. > > > > > > > > diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/main.c > > > > b/drivers/net/wireless/mediatek/mt76/mt7921/main.c > > > > index 2e6268cb06c0..1bab93d049df 100644 > > > > --- a/drivers/net/wireless/mediatek/mt76/mt7921/main.c > > > > +++ b/drivers/net/wireless/mediatek/mt76/mt7921/main.c > > > > @@ -303,6 +303,7 @@ mt7921_add_interface(struct ieee80211_hw *hw, > > > > struct ieee80211_vif *vif) > > > > > > > > mvif->bss_conf.mt76.omac_idx = mvif->bss_conf.mt76.idx; > > > > mvif->phy = phy; > > > > + mvif->bss_conf.vif = mvif; > > > > mvif->bss_conf.mt76.band_idx = 0; > > > > mvif->bss_conf.mt76.wmm_idx = mvif->bss_conf.mt76.idx % > > > > MT76_CONNAC_MAX_WMM_SETS; > > > > > > > > > > I wrote earlier that this patch works fine with linux-next-20240711 and at first > > > it did, but then another NULL pointer error occured. I'm not sure if I can > > > bisect this as it does not trigger automatically it seems. Also I'm currently > > > bisecting the problem with linux-20240712 > > > > > > Bert Karwatzki > > > > Now the above mentioned NULL pointer dereference has happended again, this time > > on linux-next-20240716. It cann be triggered by repeatedly turning the Wifi off > > and on again. For further investigation I created this patch: > > > > diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/main.c > > b/drivers/net/wireless/mediatek/mt76/mt7921/main.c > > index 2e6268cb06c0..3ecedf7bc9f3 100644 > > --- a/drivers/net/wireless/mediatek/mt76/mt7921/main.c > > +++ b/drivers/net/wireless/mediatek/mt76/mt7921/main.c > > @@ -303,6 +303,8 @@ mt7921_add_interface(struct ieee80211_hw *hw, struct > > ieee80211_vif *vif) > > > > mvif->bss_conf.mt76.omac_idx = mvif->bss_conf.mt76.idx; > > mvif->phy = phy; > > + WARN(!phy, "%s: phy = NULL\n", __func__); > > + mvif->bss_conf.vif = mvif; > > mvif->bss_conf.mt76.band_idx = 0; > > mvif->bss_conf.mt76.wmm_idx = mvif->bss_conf.mt76.idx % > > MT76_CONNAC_MAX_WMM_SETS; > > > > @@ -1182,6 +1184,12 @@ static void mt7921_ipv6_addr_change(struct ieee80211_hw > > *hw, > > struct inet6_dev *idev) > > { > > struct mt792x_vif *mvif = (struct mt792x_vif *)vif->drv_priv; > > + printk(KERN_INFO "%s: mvif = %px\n", __func__, mvif); > > + printk(KERN_INFO "%s: mvif->phy = %px\n", __func__, mvif->phy); > > + if (!mvif->phy) { > > + printk(KERN_INFO "%s: mvif->phy = NULL\n", __func__); > > + return; > > + } > > struct mt792x_dev *dev = mvif->phy->dev; > > struct inet6_ifaddr *ifa; > > struct in6_addr ns_addrs[IEEE80211_BSS_ARP_ADDR_LIST_LEN]; > > > > And the result is this (the WARN in mt7921_add_interface did not trigger): > > > > [ 367.121740] [ T861] wlp4s0: deauthenticating from 54:67:51:3d:a2:d2 by > > local choice (Reason: 3=DEAUTH_LEAVING) > > [ 367.209603] [ T861] mt7921_ipv6_addr_change: mvif = ffff954e7500de40 > > [ 367.209615] [ T861] mt7921_ipv6_addr_change: mvif->phy = 0000000000000000 > > [ 367.209621] [ T861] mt7921_ipv6_addr_change: mvif->phy = NULL > > [ 367.250026] [ T861] mt7921_ipv6_addr_change: mvif = ffff954e7500de40 > > [ 367.250034] [ T861] mt7921_ipv6_addr_change: mvif->phy = ffff954e44427768 > > [ 367.251537] [ T861] mt7921_ipv6_addr_change: mvif = ffff954e7500de40 > > [ 367.251542] [ T861] mt7921_ipv6_addr_change: mvif->phy = ffff954e44427768 > > [ 369.977123] [ T862] wlp4s0: authenticate with 54:67:51:3d:a2:d2 (local > > address=c8:94:02:c1:bd:69) > > [ 369.984864] [ T862] wlp4s0: send auth to 54:67:51:3d:a2:d2 (try 1/3) > > [ 370.006199] [ T104] wlp4s0: authenticated > > [ 370.006680] [ T104] wlp4s0: associate with 54:67:51:3d:a2:d2 (try 1/3) > > [ 370.059080] [ T98] wlp4s0: RX AssocResp from 54:67:51:3d:a2:d2 > > (capab=0x511 status=0 aid=2) > > [ 370.064067] [ T98] wlp4s0: associated > > > > So mvif->phy can be NULL at the start of mt7921_ipv6_addr_change. The early > > return in that case avoids the NULL pointer and mvif->phy has its usual value > > again on the next call to mt7921_ipv6_addr_change so Wifi is working again. I > > don't know how this could happen but perhaps you have an idea. > > This change should fix it: https://nbd.name/p/0747f54f > Please test. > > Thanks, > > - Felix > Your fix works. (I added a WARN() statement on the early return, to see if mvif- >phy actually was NULL during testing). Bert Karwatzki