Hi Baochen, On 1/22/24 6:56 PM, Baochen Qiang wrote:
Currently ath11k_mac_op_unassign_vif_chanctx() deletes peer but ath11k_mac_op_assign_vif_chanctx() doesn't create it. This results in connection failure if MAC80211 calls drv_unassign_vif_chanctx() and drv_assign_vif_chanctx() during AUTH and ASSOC, see below log: [ 102.372431] wlan0: authenticated [ 102.372585] ath11k_pci 0000:01:00.0: wlan0: disabling HT/VHT/HE as WMM/QoS is not supported by the AP [ 102.372593] ath11k_pci 0000:01:00.0: mac chanctx unassign ptr ffff895084638598 vdev_id 0 [ 102.372808] ath11k_pci 0000:01:00.0: WMI vdev stop id 0x0 [ 102.383114] ath11k_pci 0000:01:00.0: vdev stopped for vdev id 0 [ 102.384689] ath11k_pci 0000:01:00.0: WMI peer delete vdev_id 0 peer_addr 20:e5:2a:21:c4:51 [ 102.396676] ath11k_pci 0000:01:00.0: htt peer unmap vdev 0 peer 20:e5:2a:21:c4:51 id 3 [ 102.396711] ath11k_pci 0000:01:00.0: peer delete resp for vdev id 0 addr 20:e5:2a:21:c4:51 [ 102.396722] ath11k_pci 0000:01:00.0: mac removed peer 20:e5:2a:21:c4:51 vdev 0 after vdev stop [ 102.396780] ath11k_pci 0000:01:00.0: mac chanctx assign ptr ffff895084639c18 vdev_id 0 [ 102.400628] wlan0: associate with 20:e5:2a:21:c4:51 (try 1/3) [ 102.508864] wlan0: associate with 20:e5:2a:21:c4:51 (try 2/3) [ 102.612815] wlan0: associate with 20:e5:2a:21:c4:51 (try 3/3) [ 102.720846] wlan0: association with 20:e5:2a:21:c4:51 timed out The peer delete logic in ath11k_mac_op_unassign_vif_chanctx() is introduced by commit b4a0f54156ac ("ath11k: move peer delete after vdev stop of station for QCA6390 and WCN6855") to fix firmware crash issue caused by unexpected vdev stop/peer delete sequence. Actually for a STA interface peer should be deleted in ath11k_mac_op_sta_state() when STA's state changes from IEEE80211_STA_NONE to IEEE80211_STA_NOTEXIST, which also coincides with current peer creation design that peer is created during IEEE80211_STA_NOTEXIST -> IEEE80211_STA_NONE transition. So move peer delete back to ath11k_mac_op_sta_state(), also stop vdev before deleting peer to fix the firmware crash issue mentioned there. In this way the connection failure mentioned here is also fixed. Also do some cleanups in patch "wifi: ath11k: remove invalid peer create logic", and refactor in patches "wifi: ath11k: rename ath11k_start_vdev_delay()" and "wifi: ath11k: avoid forward declaration of ath11k_mac_start_vdev_delay()". Tested this patch set using QCA6390 and WCN6855 on both STA and SAP interfaces. Basic connection and ping work well.
I wanted to let you know I'm seeing similar behavior in my own testing, with these patches applied. Granted I've hacked things up quite a bit but it appears to happen after a firmware crash. It may be related to my own changes of course, but I've connected and created/started a monitor vdev, then wait. At some point the firmware crashes likely due to my botched patches, but once it starts up again I see this same timeout behavior before a retry which connects successfully.
Baochen Qiang (4): wifi: ath11k: remove invalid peer create logic wifi: ath11k: rename ath11k_start_vdev_delay() wifi: ath11k: avoid forward declaration of ath11k_mac_start_vdev_delay() wifi: ath11k: fix connection failure due to unexpected peer delete drivers/net/wireless/ath/ath11k/mac.c | 564 +++++++++++++------------- 1 file changed, 288 insertions(+), 276 deletions(-) base-commit: 8ff464a183f92836d7fd99edceef50a89d8ea797