Ping/IP network loss post-roam


Hi,

We have noticed a sporadic problem that comes and goes. It was first seen only with one specific AP manufacturer, but more recently it has also been seen with another, which is why I'm revisiting it.

The client uses QCA6174 hw3.2 hardware with the ath10k driver. The network is WPA2, configured with FT. This has been seen with both FT over-the-Air and FT over-the-DS.

The client always roams using FT without a problem. There is no indication of a failed FT (ft-auth/ft-action and assoc are both successful). Sometimes, though, the client loses all IP networking after the roam (pings, TCP and UDP all fail).

IWD gets zero indication that there is a problem after the roam. No packet/beacon loss CQM events, no deauths. On the AP side (if we get any indication of a problem at all) we see "client not responding". There appears to be a mismatch between the client's state and the state the AP thinks the client is in. The client thinks it's connected, the AP thinks the client has disappeared.

After noticing this problem we implemented a watchdog which starts pinging post-roam and, if enough pings fail, triggers a deauth and authenticates again. This at least gets the client back on the network, but obviously isn't great because the client loses networking for an extended period while waiting for pings to fail, then deauthing/reauthing, doing DHCP, etc. We haven't gotten any traction trying to explain the issue to the AP vendor. It's always a client issue...
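
For anyone wanting to reproduce the idea, the watchdog logic described above can be sketched roughly as follows. This is not the production code, only a minimal illustration. The thresholds (attempts, max_failures, interval_s), the function names and the recover callback are all hypothetical; how the deauth and re-auth is actually requested from IWD is left to the caller.

```python
import subprocess
import time
from typing import Callable


def ping_once(host: str, timeout_s: int = 1) -> bool:
    """Send one ICMP echo using the system ping; True if a reply came back."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def post_roam_watchdog(
    ping: Callable[[], bool],
    recover: Callable[[], None],
    attempts: int = 5,        # hypothetical: pings sent after each roam
    max_failures: int = 3,    # hypothetical: failures that trigger recovery
    interval_s: float = 1.0,  # hypothetical: delay between pings
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True if the link looks healthy, False if recovery was triggered."""
    failures = 0
    for _ in range(attempts):
        if not ping():
            failures += 1
            if failures >= max_failures:
                # Caller-supplied action, e.g. deauth and authenticate again.
                recover()
                return False
        sleep(interval_s)
    return True
```

It would be called once after each successful roam, for example with ping=lambda: ping_once(gateway_ip), where gateway_ip is whatever address is known to answer on that network. Note that the worst-case detection delay is roughly attempts * (timeout + interval), which is the outage window mentioned above.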

These are production devices and ath10k debugging is not built into the module. All I have are kernel/IWD logs, which just show that the roam was successful and that we deauthed later. Not particularly useful.

I'm trying to determine where the problem is (client side or infrastructure), and whether anything can be done from either an ath10k driver or supplicant (IWD) perspective. Getting ath10k logs is something I'd like to do eventually, but it's easier said than done. These devices are always running and customers generally don't want them messed with. I have ssh access, so if there is any additional info I can get without kernel changes I'm happy to try.
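
One thing that could be collected over ssh without kernel changes is the per-station counters from "iw dev wlan0 station dump", taken once while the link is healthy and once while it is in the broken state. Below is a minimal sketch of a parser for that output, assuming the usual "Station <mac> (on <ifname>)" header followed by indented "key: value" lines. The function names are hypothetical, and which counters the driver actually fills in on this hardware is not something I have verified.

```python
import re

# Header line as printed by iw, e.g. "Station aa:bb:cc:dd:ee:ff (on wlan0)".
STATION_RE = re.compile(r"Station\s+([0-9A-Fa-f:]{17})\s+\(on\s+(\S+)\)")


def parse_station_dump(text: str) -> dict:
    """Map each station MAC (lower case) to a dict of its 'key: value' fields."""
    stations = {}
    current = None
    for line in text.splitlines():
        header = STATION_RE.match(line.strip())
        if header:
            current = stations.setdefault(header.group(1).lower(), {})
            continue
        if current is not None and ":" in line:
            key, _, value = line.strip().partition(":")
            current[key.strip()] = value.strip()
    return stations


def counter_delta(before: dict, after: dict, key: str) -> int:
    """Difference of a numeric field between two snapshots of one station."""
    return int(after[key].split()[0]) - int(before[key].split()[0])
```

If "tx failed" or "tx retries" climbs quickly between the two snapshots while "rx packets" stays flat, that would at least suggest frames are leaving the client and not being acknowledged, which is consistent with the AP having dropped its state for the client.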

Nov 30 13:33:11 kernel: wlan0: disconnect from AP xx:xx:xx:xx:xx:xx for new assoc to yy:yy:yy:yy:yy:yy
Nov 30 13:33:11 kernel: wlan0: associate with yy:yy:yy:yy:yy:yy (try 1/3)
Nov 30 13:33:11 kernel: wlan0: RX ReassocResp from yy:yy:yy:yy:yy:yy (capab=0x411 status=0 aid=6)
Nov 30 13:33:11 kernel: wlan0: associated
Nov 30 13:33:11 kernel: ath: EEPROM regdomain: 0x809c
Nov 30 13:33:11 kernel: ath: EEPROM indicates we should expect a country code
Nov 30 13:33:11 kernel: ath: doing EEPROM country->regdmn map search
Nov 30 13:33:11 kernel: ath: country maps to regdmn code: 0x52
Nov 30 13:33:11 kernel: ath: Country alpha2 being used: CN
Nov 30 13:33:11 kernel: ath: Regpair used: 0x52
Nov 30 13:33:11 kernel: ath: regdomain 0x809c dynamically updated by country element

# This condition is detected by watchdog, and we deauth

Nov 30 13:33:34 kernel: wlan0: deauthenticating from yy:yy:yy:yy:yy:yy by local choice (Reason: 3=DEAUTH_LEAVING)

# We then auth to the very same BSS, successfully and have no problems (until it happens sometime later)

Nov 30 13:33:36 kernel: wlan0: authenticate with yy:yy:yy:yy:yy:yy
Nov 30 13:33:36 kernel: wlan0: send auth to yy:yy:yy:yy:yy:yy (try 1/3)
Nov 30 13:33:36 kernel: wlan0: authenticated
Nov 30 13:33:36 kernel: wlan0: associate with yy:yy:yy:yy:yy:yy (try 1/3)
Nov 30 13:33:36 kernel: wlan0: RX AssocResp from yy:yy:yy:yy:yy:yy (capab=0x411 status=0 aid=6)
Nov 30 13:33:36 kernel: wlan0: associated
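
To put a number on how long the client sits in the broken state, the timestamps in a log like the one above can be compared mechanically. This is a minimal sketch that assumes the syslog-style "Mon DD HH:MM:SS kernel: wlan0: ..." line format shown here. The year is not present in the log, so a placeholder year is assumed, and the function names are hypothetical.

```python
import re
from datetime import datetime

# Only wlan0 lines are of interest; the "ath:" regdomain lines are skipped.
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}) kernel: wlan0: (.*)$")


def parse_events(log_text: str, year: int = 2000) -> list:
    """Return (timestamp, message) tuples for each wlan0 kernel log line."""
    events = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        # The log has no year, so a placeholder is prepended (assumption).
        stamp = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
        events.append((stamp, match.group(2)))
    return events


def seconds_until_local_deauth(events: list):
    """Seconds from the last 'associated' to the first deauth that follows it."""
    last_assoc = None
    for stamp, message in events:
        if message == "associated":
            last_assoc = stamp
        elif message.startswith("deauthenticating from") and last_assoc is not None:
            return (stamp - last_assoc).total_seconds()
    return None
```

For the excerpt above this gives 23 seconds between the post-roam "associated" at 13:33:11 and the watchdog's deauth at 13:33:34. That is a lower bound on the outage, since re-association (13:33:36) and DHCP still have to complete afterwards.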

Thanks,

James
