On Tue, Aug 16, 2016 at 03:21:22PM -0400, Kevin O'Connor wrote:
> I've found that my wifi connection always seems to hang after a few
> hours unless I dramatically increase "dot11RSNAConfigPMKLifetime".
> What's the best way to further debug (and hopefully find the root
> cause of this issue)? What's the down side to increasing this value?
>
> I'm using wpa_supplicant from git (31d3692f) on an openwrt based
> router (tp-link archer c7 / ath10k). The 5ghz radio in the router is
> setup as a client using WPA2-EAP / TTLS / MSCHAPv2. I don't have
> admin access to the AP, but it appears to be a "Ruckus Wireless
> ZoneFlex 802.11ac wave 2 4x4 access point".

Did you happen to figure out how to resolve this issue or find any more
details on it? Based on the information here, it sounds like there is
some kind of interop issue with the specific AP used here for the case
where EAP reauthentication is forced by the client. As far as I can
tell, this works correctly in wpa_supplicant and works fine when tested
against a hostapd-based AP/Authenticator.

> I enabled debugging in wpa_supplicant (-dd -f /var/log/wpalog) and
> found that every 2 hours the AP seems to initiate a re-key event (mac
> addresses removed):
> Aug 07 19:15:48 wlan0: WPA: Group rekeying completed with (...mac...) [GTK=CCMP]

This is group (GTK) rekeying.

> this AP initiated re-keying is always successful. However, every 8.4
> hours it seems as if the client initiates a re-key event which looks
> like:
>
> Aug 07 20:15:52 EAPOL: txStart

While this one is a full EAP reauthentication, which is supposed to be
followed by a 4-way handshake to derive a new pairwise (unicast) key
(PTK).

> Aug 07 20:15:55 wlan0: WPA: Key negotiation completed with (...mac...) [PTK=CCMP
> +GTK=CCMP]

And it sounds like this was completed successfully as far as
wpa_supplicant is concerned.

> After this event the connection always goes into a "dead" state. (OS
> reports interface up, but no packets come through.) This "re-key
> event" seems much more intensive (the logs are much bigger - eg, it
> completes a x509 certificate check), but I see no indication of any
> failure or error messages.

This is indeed much more than the AP-triggered GTK rekeying. Not seeing
packets go through after this (successful-looking EAP reauthentication
+ 4-way handshake) would imply that there is some kind of mismatch in
the encryption keys between the AP and the station. It can be a bit
tricky to debug this, though, since it would likely take either access
to debugging on both ends or, alternatively, a full sniffer capture
with known keys so that the encrypted frames can be decrypted to
analyze what exactly happened and whether the AP or the STA is using
incorrect keys to encrypt frames after EAP reauthentication.

> The connection seems to stay in this dead state until the next AP
> initiated re-key event (often an hour or so later) - at which time the
> re-keying fails after several attempts and then the interface is reset
> causing the connection to come back up. The pattern repeats itself
> every 8.4 hours. Interestingly, after coming up the second time, the
> log has lots of these messages:
>
> ...
> Aug 08 05:51:53 EAPOL: EAP Session-Id not available
> Aug 08 05:51:58 EAPOL: EAP Session-Id not available
> Aug 08 05:52:04 EAPOL: EAP Session-Id not available
> ...
>
> However, these messages don't seem to adversely impact the connection
> or change the pattern above.

Yeah, those can be ignored since EAP Session-Id is not needed here.

This "fixing" of the issue is due to the AP initiating GTK rekeying,
failing to complete it with this STA, and consequently forcing
disconnection. That failure to rekey is expected here since the data
connection was already broken.

> As above, I can work around the problem by increasing
> dot11RSNAConfigPMKLifetime in the config file. I also tried setting
> "fast_reauth=0" but that did not have an impact. With
> "dot11RSNAConfigPMKLifetime=31536000" I've seen a solid connection for
> multiple days.
>
> Any ideas on how I can further debug/fix this?

See the notes above on what this would take: either debugging from the
AP side or a sniffer capture plus all the needed keys for analysis.

Using a larger dot11RSNAConfigPMKLifetime value sounds like a
reasonable workaround for this, though. All it does here is give the AP
full control over when to force PMK rekeying (i.e., in practice, when
to force EAP reauthentication).

-- 
Jouni Malinen                                            PGP id EFC895FA

_______________________________________________
Hostap mailing list
Hostap@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/hostap