28.05.2021 01:47, Dmitry Osipenko wrote:
> 27.05.2021 19:42, Arend van Spriel wrote:
>> On 5/26/2021 5:10 PM, Dmitry Osipenko wrote:
>>> Hello,
>>>
>>> After updating to Ubuntu 21.04 I found two problems related to the
>>> BRCMF_C_GET_ASSOCLIST using an older BCM4329 SDIO WiFi.
>>>
>>> 1. The kernel is spammed with:
>>>
>>> ieee80211 phy0: brcmf_cfg80211_dump_station: BRCMF_C_GET_ASSOCLIST
>>> unsupported, err=-52
>>> ieee80211 phy0: brcmf_cfg80211_dump_station: BRCMF_C_GET_ASSOCLIST
>>> unsupported, err=-52
>>> ieee80211 phy0: brcmf_cfg80211_dump_station: BRCMF_C_GET_ASSOCLIST
>>> unsupported, err=-52
>>>
>>> Which happens apparently due to a newer NetworkManager version that
>>> pokes dump_station() periodically. I sent [1] that fixes this noise.
>>>
>>> [1]
>>> https://patchwork.kernel.org/project/linux-wireless/list/?series=480715
>>
>> Right. I noticed this one and did not have anything to add to the
>> review/suggestion.
>
> Please feel free to add your r-b to the patches if they look good to you.
>
>>> 2. The other much worse problem is that WiFi eventually dies now with
>>> these errors:
>>>
>>> ...
>>> ieee80211 phy0: brcmf_cfg80211_dump_station: BRCMF_C_GET_ASSOCLIST
>>> unsupported, err=-52
>>> brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
>>> ieee80211 phy0: brcmf_cfg80211_dump_station: BRCMF_C_GET_ASSOCLIST
>>> unsupported, err=-110
>>> ieee80211 phy0: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg
>>> failed w/status -110
>>>
>>> From this point all firmware calls start to fail with err=-110 and
>>> WiFi doesn't work anymore. This problem is reproducible with 5.13-rc
>>> and current -next, I haven't checked older kernel versions. Somehow
>>> it's worse using a recent -next, WiFi dies quicker.
>>>
>>> What's interesting is that I see that there is always a pending signal
>>> in brcmf_sdio_dcmd_resp_wait() when timeout happens. It looks like the
>>> timeout happens when there is access to a swap partition, which stalls
>>> system for a second or two, but this is not 100%. Increasing
>>> DCMD_RESP_TIMEOUT doesn't help.
>>
>> The timeout error (-110) can have two root causes that I am aware of.
>> Either the firmware died or the SDIO layer has gone haywire. Not sure if
>> that swap partition is on eMMC device, but if so it could be related.
>> You could try generating device coredump. If that also gives -110 errors
>> we know it is the SDIO layer.
>
> Coredump is a good idea, thank you. The swap partition is on external SD
> card, everything else is on eMMC.
>
>>> Please let me know if you have any ideas of how to fix this trouble
>>> properly or if you need any more info.
>>>
>>> Removing BRCMF_C_GET_ASSOCLIST firmware call entirely from the driver
>>> fixes the problem.
>>
>> My guess is that reducing interaction with firmware is what is avoiding
>> the issue and not so much this specific firmware command. As always it
>> is good to know the conditions in which the issue occurs. What is the
>> hardware platform you are running Ubuntu on? Stuff like that.
>
> That's an older Acer A500 NVIDIA Tegra20 tablet device [1]. I may also
> try to reproduce problem on Tegra30 Nexus 7 with BCM4330.
>
> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/boot/dts/tegra20-acer-a500-picasso.dts
>
> Thank you very much for the suggestions. I will try to collect more info
> and come back with the report.
>

I have been testing this for the past few weeks and the problem is no
longer reproducible. Apparently something got fixed in linux-next. I
haven't tried to bisect the fix since that would be too painful. There
are still occasional -110 errors when the system stalls on memory swap,
but WiFi keeps working now.
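
For anyone reading the logs above: the two codes are negated Linux errno
values, -52 being EBADE and -110 being ETIMEDOUT in the generic errno
table. A minimal sketch for pulling them out of saved log lines follows.
The names LINUX_ERRNO, ERR_RE and decode are hypothetical helpers made
up for this note, not part of brcmfmac or any tool, and the table only
covers the two codes seen in this thread.

```python
import re

# Assumption: values taken from the generic Linux errno table.
# Only the two codes that appear in this thread are listed.
LINUX_ERRNO = {52: "EBADE", 110: "ETIMEDOUT"}

# Matches both "err=-52" and "failed w/status -110" style suffixes.
ERR_RE = re.compile(r"(?:err=|status )-(\d+)")


def decode(line):
    """Return the errno name found in a log line, or None if there is no code."""
    match = ERR_RE.search(line)
    if match is None:
        return None
    code = int(match.group(1))
    return LINUX_ERRNO.get(code, "unknown ({})".format(code))


if __name__ == "__main__":
    sample = [
        "brcmf_cfg80211_dump_station: BRCMF_C_GET_ASSOCLIST unsupported, err=-52",
        "brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout",
        "brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110",
    ]
    for entry in sample:
        print(decode(entry), "<-", entry)
```

Counting how often each name shows up before and after a swap stall
would be one way to tell the harmless -52 noise from the -110 timeouts.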