
RE: [REGRESSION] Freeze on resume from S3 (bisected)

Hi Mathew,

Forty Five <mathewegeorge@xxxxxxxxx> wrote:
> 
> Ping-Ke Shih <pkshih@xxxxxxxxxxx> writes:
> 
> > Since I saw 'NetworkManager' and 'hostapd' in code trace, I would like to know
> > if you have two virtual interfaces, which for STA and AP modes? (Please check
> > this by 'iw dev') If so, is it possible to remove hostapd (AP mode) to see if
> > this is a factor causing crash.
> 
> I use hostapd as part of a Wi-Fi hotspot setup for this laptop. I REALLY
> wish I'd connected the dots earlier and realised that it could be
> related to this issue. While running gbcbefbd032 (first bad commit), I
> disabled all the components of my setup and the issue went away; then I
> enabled them one by one until the issue emerged. I'll walk you through
> the relevant details, and my observations during this process.
> 
> I create a virtual interface for hostapd using this systemd unit:
> 
> ```
> [Unit]
> Requires=sys-subsystem-net-devices-wlo1.device
> After=network.target
> After=sys-subsystem-net-devices-wlo1.device
> [Service]
> Type=oneshot
> ExecStart=/usr/bin/iw dev wlo1 interface add wlo1_ap type __ap addr "xx:xx:xx:xx:xx:xx"
> ExecStart=/usr/bin/ip addr add 192.168.30.1/24 dev wlo1_ap
> [Install]
> WantedBy=multi-user.target
> ```
> 
> I need the '__ap' type because my card doesn't support two interfaces in
> managed mode; see [1] for details.
> 
> [1] https://wiki.archlinux.org/title/Talk:Software_access_point#Two_interfaces_on_same_card
> 
> Then I configure NetworkManager to ignore this interface.
> 
> ```
> ;; in /etc/NetworkManager/conf.d/unmanaged.conf
> [keyfile]
> unmanaged-devices=interface-name:wlo1_ap
> ```
> 
> Coming to hostapd - this is where it gets rather complicated. First off,
> let me mention that when I enabled hostapd.service again, I started
> seeing the 'phy0: resume with hardware scan still in progress' warnings,
> which had gone away up to this point.
> 
> Next - once I enabled hostapd.service, I was able to reproduce the
> crashes. However, the dmesg in the crash log was different from what I
> see when I have the rest of my setup enabled (I hadn't applied either
> patch when this crash happened, and it's on b54846da4 because that's the
> earliest bad commit in which I'm able to produce crash logs at all, as I
> described in my original message):

Your setup is quite complicated, so I can't easily reproduce it on my side,
and I don't have time to dig deeper. I suspect there is more than one problem,
so please help run some experiments to narrow down the scope.

The first problem is the culprit commit [1], which makes the system freeze, and
I still think the patch [2] you have already tried can fix it. Please use [1] as
the code base and apply patch [2] to see the result (#exp 1). The difference
between a tree without [1] and a tree with [1] + [2] is the timing at which the
driver reports scan abort completion to mac80211. Also, the last few logs you
collected show that the crash happens a long time after the scan abort.

The second problem is that the WiFi firmware gets into an abnormal state during
resume. The log looks like this (partial):
	 [    T562] rtw89_8852be 0000:02:00.0: R_AX_RPQ_RXBD_IDX =0x00000000
	 [    T562] rtw89_8852be 0000:02:00.0: R_AX_DBG_ERR_FLAG=0x00000000
	 [    T562] rtw89_8852be 0000:02:00.0: R_AX_LBC_WATCHDOG=0x00000081
	 [    T562] rtw89_8852be 0000:02:00.0: <---
	 [    T562] rtw89_8852be 0000:02:00.0: SER catches error: 0x5000
On my side this is rare, and it does not seem to happen in your last few logs.
I'm not sure whether that is a timing effect of adding many log messages. I
would defer this problem for now.

A third (unconfirmed) problem could have been introduced by commits between [1]
and [3]. If the first problem is addressed by #exp 1, it should be possible to
bisect between [1] and [3]. To check whether [1] is the only problem, please
also revert that commit on top of [3] and see whether it becomes good (#exp 2).

Summary: 

     o 5bbd9b249880 [3] (v6.10-rc5)
     |              #exp 2: 5bbd9b249880 + [4] (revert [1]; I feel this would be bad).
     :
     :
     :
     o bcbefbd032df [1] ("wifi: rtw89: add wait/completion for abort scan")
     |              #exp 1: bcbefbd032df + [2] (I think this will be good.)
     o 7e11a2966f51 (this commit is good)
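
A minimal sketch of how the two experiments in the summary could be prepared in
a kernel git tree. The branch names (exp1, exp2) and the file name used for a
local copy of [2] are placeholders, not something from this thread. By default
the script only prints the git commands; set RUN=1 to execute them.

```shell
#!/bin/sh
# Sketch only: prepare the source trees for #exp 1 and #exp 2.
# Assumptions: run inside a kernel git checkout; [2] was saved from lore
# in mbox form as $FIX_PATCH (hypothetical name); [4] is the attachment.

BAD=bcbefbd032df                 # [1] first bad commit
TOP=5bbd9b249880                 # [3] v6.10-rc5
FIX_PATCH=fix-scan-abort.patch   # hypothetical local copy of [2]
REVERT_PATCH=0001-Revert-wifi-rtw89-add-wait-completion-for-abort-scan.patch  # [4]

# Print the command unless RUN=1, in which case execute it.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "$*"
    fi
}

# exp 1: first bad commit plus the scan abort fix [2]
exp1() {
    run git checkout -b exp1 "$BAD"
    run git am "$FIX_PATCH"
}

# exp 2: top of tree with [1] reverted via the attached patch [4]
exp2() {
    run git checkout -b exp2 "$TOP"
    run git am "$REVERT_PATCH"
}
```

Calling exp1 or exp2 only prepares the tree; building and booting the kernel
and then testing suspend/resume with hostapd enabled are still separate steps.
If [2] was saved as a bare diff rather than in mbox form, `git apply` would be
needed instead of `git am`.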



[1] bcbefbd032df ("wifi: rtw89: add wait/completion for abort scan")
[2] fix scan abort https://lore.kernel.org/linux-wireless/20240517013350.11278-1-pkshih@xxxxxxxxxxx/
[3] 5bbd9b249880 (v6.10-rc5; the top of tree you are trying)
[4] attached revert patch of [1]

Ping-Ke

Attachment: 0001-Revert-wifi-rtw89-add-wait-completion-for-abort-scan.patch

