wpa_supplicant over-eagerly blacklisting AP sending PREV_AUTH_NOT_VALID?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



[I originally posted this to the linux-wireless mailing list,
but it was pointed out to me that it should have gone here.
My apologies to anyone who's on both lists and therefore sees
it twice.]

I have some reason to believe the following:

  * Some Cisco wireless APs will regularly try to force clients
    to reauthenticate by sending deauthorization frames
    with reason code 2 (PREV_AUTH_NOT_VALID).

  * When one of these arrives, wpa_supplicant will respond by
    putting the AP on a blacklist and roaming to another AP
    rather than by immediately trying to reauthenticate with
    the same AP.

This is a Bad Thing (isn't it?) because e.g. if you have two of
these APs within range but one provides a much better signal
than the other, you'll alternate between them rather than
sticking with the good one. It seems like it might be better
for wpa_supplicant to try the original AP again immediately.

(The problem that actually sent me looking at this stuff is
more severe and involves machines completely falling off
the network after several of these transitions, but I think
that's a separate issue that I don't understand yet.)

Some details follow.

                               *

Background:

I have (more precisely, my employer has) a Linux machine sitting on
a wireless network. After a while (often at intervals slightly greater
than two hours) it falls off the network and will not automatically
reconnect. This is unfortunate for us because it's not our wireless
network but another company's, and the machine is meant to
do its thing without manual intervention.

After cranking up the logging verbosity of wpa_supplicant on the
affected machine and digging through the syslog, here is at least
part of what's going on.

  * There are multiple APs in range (and in some cases the same
    physical AP has multiple BSSIDs).

  * After being connected to any of them for half an hour,
    our machine gets kicked off, apparently because the AP
    sent a deauth frame with reason code 2
    (PREV_AUTH_NOT_VALID).

  * When this happens, the client's wpa_supplicant puts
    the AP on its blacklist and roams to a different AP.

  * Sometimes, after several of these, something goes
    wrong enough in our attempts to reconnect that the
    machine gives up completely. I think this is caused by
    an entirely different problem; I mention it only for
    context.

                               *

Evidence from the logs:

Here is a (lightly censored) portion of the syslog at the time of
one of those deauthorizations. I've censored the client's
hostname, the AP's BSSID, and the client's MAC address.
This is with wpa_supplicant -dd.

Jan 28 17:54:49 CENSORED-HOSTNAME wpa_supplicant[10876]: RTM_NEWLINK: ifi_index=3 ifname=wlan0 operstate=2 linkmode=1 ifi_flags=0x1003 ([UP]) Jan 28 17:54:49 CENSORED-HOSTNAME kernel: [281557.856806] wlan0: deauthenticated from CENSORED-BSSID (Reason: 2=PREV_AUTH_NOT_VALID) Jan 28 17:54:49 CENSORED-HOSTNAME wpa_supplicant[10876]: nl80211: Event message available Jan 28 17:54:49 CENSORED-HOSTNAME wpa_supplicant[10876]: nl80211: Drv Event 20 (NL80211_CMD_DEL_STATION) received for wlan0 Jan 28 17:54:49 CENSORED-HOSTNAME wpa_supplicant[10876]: nl80211: Delete station CENSORED-BSSID Jan 28 17:54:49 CENSORED-HOSTNAME wpa_supplicant[10876]: nl80211: Event message available Jan 28 17:54:49 CENSORED-HOSTNAME wpa_supplicant[10876]: nl80211: Drv Event 39 (NL80211_CMD_DEAUTHENTICATE) received for wlan0 Jan 28 17:54:49 CENSORED-HOSTNAME wpa_supplicant[10876]: nl80211: MLME event 39 (NL80211_CMD_DEAUTHENTICATE) on wlan0(CENSORED-MAC-ADDR) A1=CENSORED-MAC-ADDR A2=CENSORED-BSSID Jan 28 17:54:49 CENSORED-HOSTNAME wpa_supplicant[10876]: nl80211: MLME event frame - hexdump(len=26): c0 00 30 00 xx xx xx xx xx xx xx xx xx xx xx xx xx xx xx xx xx xx 30 ca 02 00 Jan 28 17:54:49 CENSORED-HOSTNAME wpa_supplicant[10876]: nl80211: Deauthenticate event Jan 28 17:54:49 CENSORED-HOSTNAME wpa_supplicant[10876]: wlan0: Event DEAUTH (12) received Jan 28 17:54:49 CENSORED-HOSTNAME wpa_supplicant[10876]: wlan0: Deauthentication notification
Jan 28 17:54:49 CENSORED-HOSTNAME wpa_supplicant[10876]: wlan0:  * reason 2
Jan 28 17:54:49 CENSORED-HOSTNAME wpa_supplicant[10876]: wlan0: * address CENSORED-BSSID Jan 28 17:54:49 CENSORED-HOSTNAME wpa_supplicant[10876]: Deauthentication frame IE(s) - hexdump(len=0): [NULL] Jan 28 17:54:49 CENSORED-HOSTNAME wpa_supplicant[10876]: wlan0: CTRL-EVENT-DISCONNECTED bssid=CENSORED-BSSID reason=2 Jan 28 17:54:49 CENSORED-HOSTNAME wpa_supplicant[10876]: wlan0: Auto connect enabled: try to reconnect (wps=0 wpa_state=9) Jan 28 17:54:49 CENSORED-HOSTNAME wpa_supplicant[10876]: wlan0: Setting scan request: 0.100000 sec Jan 28 17:54:49 CENSORED-HOSTNAME wpa_supplicant[10876]: Added BSSID CENSORED-BSSID into blacklist

>From the first three octets of the BSSIDs it is apparent that
all the APs involved are Cisco ones. I know that at least some
Cisco APs have a session timeout defaulting to half an hour,
after which they will force reauthentication.

I believe the second message above (wlan0: deauthenticated ...)
is generated in function ieee80211_rx_mgmt_deauth in
net/mac80211/mlme.c, and it looks to me as if this indicates
that the AP sent us a deauth frame with reason code 2
("previous authentication no longer valid").

Then, near the end of the excerpt above, the message
"Auto connect enabled ..." comes from function
wpa_supplicant_event_disassoc_finish in wpa_supplicant;
that function has the ability to cause an immediate attempt
to reauthenticate with the same AP, but doesn't do so in
this case because it will only do it for reason codes 4,6,7
and this deauthorization had reason code 2.

                               *

I am not very expert in the ways of wireless networking;
perhaps these Cisco APs are behaving improperly, though
what they're doing seems reasonable enough to me. But
it seems to me that in this situation wpa_supplicant would
do better to try again immediately with the same AP, at
least once.

(The machine in question is a Lenovo laptop with an Intel
wireless card; if the details of the hardware are important
I can find them out, but I bet they aren't. It's running
Ubuntu 14.04 LTS which uses wpa_supplicant 2.1, but it
looks to me as if the relevant code is the same in more
recent versions of wpa_supplicant. Again, more details
are available on request.)

[I am not subscribed to the hostap@ list but will be
watching the archives.]

--
Gareth McCaughan



_______________________________________________
Hostap mailing list
Hostap@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/hostap



[Index of Archives]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Device Mapper]

  Powered by Linux