On 2024-09-26 18:51, Bitterblue Smith wrote:
On 26/09/2024 16:04, petter@xxxxxxxxxx wrote:
On 2024-09-25 13:46, Bitterblue Smith wrote:
Hi,
I have this problem with RTL8811CU, RTL8723DU, RTL8811AU, RTL8812AU.
I assume all USB devices are affected. If I have qBittorrent running,
the wifi stops working after a few hours:
Sep 24 00:48:21 ideapad2 wpa_supplicant[1290]: wlp3s0f3u2i2:
CTRL-EVENT-BEACON-LOSS
Sep 24 00:48:21 ideapad2 kernel: rtw_8723du 1-2:1.2: failed to get tx
report from firmware
Sep 24 00:48:23 ideapad2 wpa_supplicant[1290]: wlp3s0f3u2i2:
CTRL-EVENT-BEACON-LOSS
Sep 24 00:48:23 ideapad2 kernel: rtw_8723du 1-2:1.2: failed to get tx
report from firmware
Sep 24 00:48:25 ideapad2 wpa_supplicant[1290]: wlp3s0f3u2i2:
CTRL-EVENT-BEACON-LOSS
Sep 24 00:48:25 ideapad2 kernel: rtw_8723du 1-2:1.2: failed to get tx
report from firmware
Sep 24 00:48:27 ideapad2 wpa_supplicant[1290]: wlp3s0f3u2i2:
CTRL-EVENT-BEACON-LOSS
Sep 24 00:48:27 ideapad2 kernel: rtw_8723du 1-2:1.2: failed to get tx
report from firmware
Sep 24 00:48:29 ideapad2 wpa_supplicant[1290]: wlp3s0f3u2i2:
CTRL-EVENT-BEACON-LOSS
Sep 24 00:48:29 ideapad2 kernel: rtw_8723du 1-2:1.2: failed to get tx
report from firmware
Sep 24 00:48:31 ideapad2 wpa_supplicant[1290]: wlp3s0f3u2i2:
CTRL-EVENT-BEACON-LOSS
Sep 24 00:48:31 ideapad2 kernel: rtw_8723du 1-2:1.2: failed to get tx
report from firmware
Sep 24 00:48:33 ideapad2 wpa_supplicant[1290]: wlp3s0f3u2i2:
CTRL-EVENT-BEACON-LOSS
Sep 24 00:48:33 ideapad2 kernel: rtw_8723du 1-2:1.2: failed to get tx
report from firmware
Sep 24 00:48:35 ideapad2 wpa_supplicant[1290]: wlp3s0f3u2i2:
CTRL-EVENT-BEACON-LOSS
Sep 24 00:48:35 ideapad2 kernel: rtw_8723du 1-2:1.2: failed to get tx
report from firmware
Sep 24 00:48:37 ideapad2 wpa_supplicant[1290]: wlp3s0f3u2i2:
CTRL-EVENT-BEACON-LOSS
Sep 24 00:48:37 ideapad2 kernel: rtw_8723du 1-2:1.2: failed to get tx
report from firmware
Sep 24 00:48:39 ideapad2 wpa_supplicant[1290]: wlp3s0f3u2i2:
CTRL-EVENT-BEACON-LOSS
Sep 24 00:48:39 ideapad2 kernel: rtw_8723du 1-2:1.2: failed to get tx
report from firmware
Sep 24 00:48:41 ideapad2 wpa_supplicant[1290]: wlp3s0f3u2i2:
CTRL-EVENT-BEACON-LOSS
Sep 24 00:48:41 ideapad2 kernel: rtw_8723du 1-2:1.2: failed to get tx
report from firmware
Sep 24 00:48:42 ideapad2 wpa_supplicant[1290]: wlp3s0f3u2i2:
CTRL-EVENT-DISCONNECTED bssid=... reason=4 locally_generated=1
Sep 24 00:48:42 ideapad2 wpa_supplicant[1290]: wlp3s0f3u2i2: Added
BSSID ... into ignore list, ignoring for 10 seconds
Sep 24 00:48:42 ideapad2 NetworkManager[433]: <info>
[1727128122.0377] device (wlp3s0f3u2i2): supplicant interface state:
completed -> disconnected
Sep 24 00:48:45 ideapad2 NetworkManager[433]: <info>
[1727128125.6030] device (wlp3s0f3u2i2): supplicant interface state:
disconnected -> scanning
Sep 24 00:48:47 ideapad2 wpa_supplicant[1290]: wlp3s0f3u2i2: Removed
BSSID ... from ignore list (clear)
Sep 24 00:48:47 ideapad2 wpa_supplicant[1290]: wlp3s0f3u2i2: SME:
Trying to authenticate with ... (SSID='...' freq=2472 MHz)
Sep 24 00:48:50 ideapad2 kernel: wlp3s0f3u2i2: authenticate with ...
(local address=,,,)
Sep 24 00:48:51 ideapad2 NetworkManager[433]: <info>
[1727128131.2488] device (wlp3s0f3u2i2): supplicant interface state:
scanning -> authenticating
Sep 24 00:48:51 ideapad2 kernel: wlp3s0f3u2i2: send auth to ... (try
1/3)
Sep 24 00:48:51 ideapad2 kernel: rtw_8723du 1-2:1.2: failed to get tx
report from firmware
Sep 24 00:48:52 ideapad2 kernel: wlp3s0f3u2i2: send auth to ... (try
2/3)
Sep 24 00:48:52 ideapad2 kernel: rtw_8723du 1-2:1.2: failed to get tx
report from firmware
Sep 24 00:48:53 ideapad2 kernel: wlp3s0f3u2i2: send auth to ... (try
3/3)
Sep 24 00:48:53 ideapad2 kernel: rtw_8723du 1-2:1.2: failed to get tx
report from firmware
Sep 24 00:48:54 ideapad2 kernel: wlp3s0f3u2i2: authentication with
... timed out
After this all scans return nothing. The chip is still alive,
though. The LED blinks during the scans (it's hardware-controlled)
and another device in monitor mode can see the probe requests.
I confirmed that even C2H messages stop coming. I used aireplay-ng to
send some authentication or association frames (can't remember which),
which require a TX ACK report. I saw "failed to get tx report from
firmware" and no C2H.
While qBittorrent is needed to trigger this bug, simply downloading
a random Linux iso did not do the job. "Other" torrents did. It's
unclear why. Maybe it's uploading that triggers the bug.
I left iperf3 running all day and nothing happened. Only qBittorrent
can break it.
RTL8822CE doesn't have this problem. I can use qBittorrent with it
just fine.
I mounted debugfs and dumped the MAC registers during a scan using
this command:
for i in {00..20}; do sleep 0.5; cat /sys/kernel/debug/ieee80211/phy2/rtw88/mac_{0..7} > dead-$i.txt; done
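To see whether any MAC register changes between snapshots, the dead-NN.txt files written by the command above can be compared pairwise. This is a minimal sketch, not part of rtw88: the function name changed_dumps is hypothetical, and it only reports which snapshot differs from the one before it.

```shell
# Hypothetical helper: print every dump file whose contents differ
# from the previous file in the argument list.
changed_dumps() {
    prev=""
    for f in "$@"; do
        # cmp -s is silent and returns non-zero when the files differ
        if [ -n "$prev" ] && ! cmp -s "$prev" "$f"; then
            echo "$f"
        fi
        prev=$f
    done
}
```

Running `changed_dumps dead-*.txt` lists the snapshots where something changed; `diff` on such a file and its predecessor then shows the affected register lines.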
I thought maybe some RX URBs failed silently and rtw88 stopped
sending them to the device (== stopped requesting data from it),
but that's not the case. [1]
I have the device in this state right now. Is there anything else
I should look at?
What hardware are you running on? This looks very similar to an
issue that some colleagues and I have seen from time to time when
using the LM842 (8822cu)[1][2][3] on our i.MX6SX ARM board. It has,
though, become harder and harder to trigger that issue on our board.
But the outcome when it happens is identical to yours. In our case we
get it when running a number of Mender streamed installations. We can
also trigger something similar when doing hw-offload scanning, so we
have disabled that in our setup. For us, however, it seems related to
slower platforms; we haven't seen it on systems with better
performance. It also became a lot better when USB RX aggregation
was added for the chip and we ran with the patch in [3]. We also got
it on the LM808 (8812AU); then, following a suggestion, we tried the
morrownr driver [4] with USB aggregation enabled and couldn't trigger
it anymore. But it feels like all these things are just ways to reduce
the risk of getting into this state. So I think you have just
found yet another way to reproduce the behavior. Hopefully that is
the first step toward finding the root cause. I will gladly help to
test things in this area if you guys find something interesting.
[1]
https://lore.kernel.org/all/20230526055551.1823094-1-petter@xxxxxxxxxx/t/
[2]
https://lore.kernel.org/linux-wireless/20230616122612.GL18491@xxxxxxxxxxxxxx/T/#t
[3]
https://lore.kernel.org/linux-wireless/20230612134048.321500-1-petter@xxxxxxxxxx/
[4] https://github.com/morrownr/8812au-20210820
[1]
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/net/wireless/realtek/rtw88/usb.c?h=v6.10.11&id=25eaef533bf3ccc6fee5067aac16f41f280e343e#n641
The hardware is a Lenovo Ideapad 3 15ADA6 with AMD Athlon Gold 3150U.
How does Mender handle the data transfers? Does it have something
in common with torrents?
In my case I guess they behave a bit the same, meaning that Mender will
sort of stream data, downloading the OS image in chunks and writing it
to disk. So both the torrent client and Mender will most likely stress
the network and system in a similar way, by performing a lot of RX +
disk I/O, which seems to make the driver misbehave after some time.
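The access pattern described above (a streamed download written straight to disk in large blocks) can be roughly imitated with a small shell function. This is only a sketch: stream_to_disk is a hypothetical name, the 1048576-byte block size is borrowed from the Mender log output quoted in this thread, and nothing here is confirmed to trigger the bug.

```shell
# Hypothetical helper: copy stdin to a file in blocks of up to 1 MiB,
# loosely imitating a streamed image install (RX + disk I/O).
stream_to_disk() {
    # $1: destination file (assumption: any writable path)
    dd of="$1" bs=1048576 2>/dev/null
}
```

For example, `curl -s "$URL" | stream_to_disk /tmp/image.bin`, where `$URL` is a placeholder for any large file reachable over the wifi link.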
For me it feels like it is easier to trigger the issue with Mender
updates when combining them with some TX traffic etc., which I guess is
what happens when you use qBittorrent.
I will see if I can also reproduce the issue using BitTorrent in some
good way. Also, after the USB aggregation changes I do not see the
"failed to get tx report" issue that frequently. Instead it is more
often stuck with "firmware failed to leave lps state", but the rest of
the side effects are the same as you describe.
INFO[0000] Native sector size of block device /dev/mmcblkX is 512 bytes.
Mender will write in chunks of 1048576 bytes
.................[54407.626931] rtw_8822cu 1-1:1.2: firmware failed to
leave lps state
[54408.136328] Bluetooth: hci0: urb fb582009 failed to resubmit (2)
[54408.622588] wlxxxxxxx: deauthenticating from e5:65:d5:35:95:d5 by
local choice (Reason: 3=DEAUTH_LEAVING)
[54408.919367] rtw_8822cu 1-1:1.2: firmware failed to leave lps state
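Both failure signatures seen in this thread can be counted in a kernel log with a short filter. This is a minimal sketch, assuming each message appears unwrapped on a single line (as it does in the journal, unlike in the mail-wrapped excerpts here); count_rtw_failures is a hypothetical helper name.

```shell
# Hypothetical helper: count the two rtw88 failure messages in a log
# read from stdin and print one summary line.
count_rtw_failures() {
    awk '
        /failed to get tx report from firmware/ { tx++ }
        /firmware failed to leave lps state/    { lps++ }
        END { printf "tx_report=%d lps=%d\n", tx, lps }
    '
}
```

Typical use would be `journalctl -k -b | count_rtw_failures` or `dmesg | count_rtw_failures`, which makes it easier to compare how often each message shows up before and after a driver change.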