John,

> > I suspect that some memory is getting overwritten or something. The
> > embedded struct thing was a bit of a hack.
> >
> > Also please put a printk into the iwlwifi code and into
> > net/mac80211/mlme.c where it assigns
> > sdata->vif.bss_conf.beacon_int = bss->cbss.beacon_interval;
> >
> > so that I can see the order this happening.
>
> I have applied the patch attached as "debug.patch". I booted F10,
> logged-in, and allowed NetworkManager to establish a connection.
> The resulting dmesg output is attached as "dmesg.txt".
>
> Hth! Let me know if you want more...

Sorry for the delay. Reinette made me aware that I'd neglected to work
on this!

Unfortunately, I don't see anything going wrong. The problem is in this
sequence:

> net/wireless/scan.c 244
> ffff88006ed59de0

here we find the BSS to use

> wlan0: authenticate with AP 00:18:84:80:c6:b1

and authenticate with it

> net/wireless/scan.c 244
> (null)

but now it has expired (I can only guess where this is called from)

> wlan0: authenticated
> wlan0: associate with AP 00:18:84:80:c6:b1

we associate

> net/wireless/scan.c 244
> (null)
> net/wireless/scan.c 244
> (null)

not sure

> wlan0: RX AssocResp from 00:18:84:80:c6:b1 (capab=0x421 status=0 aid=1)
> wlan0: associated
> net/wireless/scan.c 244
> (null)

this corresponds to this code:

        bss_info_changed |= BSS_CHANGED_ASSOC;
        ifmgd->flags |= IEEE80211_STA_ASSOCIATED;

        bss = ieee80211_rx_bss_get(local, ifmgd->bssid,
                                   conf->channel->center_freq,
                                   ifmgd->ssid, ifmgd->ssid_len);
        if (bss) {
                /* set timing information */

in ieee80211_set_associated

> drivers/net/wireless/iwlwifi/iwl-agn.c 643
> 0

because "bss" is NULL, we don't actually set the timing information, and
thus iwlwifi gets a 0 value here.
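To make the failure mode concrete, here is a minimal userspace sketch of it. It is only a toy model: every name in it (toy_bss, toy_conf, toy_bss_lookup, toy_set_associated) is hypothetical and not part of mac80211 or cfg80211, and the "expired" flag is an assumed stand-in for an entry having been dropped from the scan list. It shows only that when the lookup fails, the beacon interval is never copied and stays 0.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT the real cfg80211/mac80211 types. */
struct toy_bss {
        unsigned char bssid[6];
        unsigned short beacon_interval;
        int expired;                    /* set once the entry has aged out */
};

struct toy_conf {
        unsigned short beacon_int;      /* stays 0 unless timing info is copied */
        int associated;
};

/* Look an entry up by BSSID; expired entries are invisible to the caller. */
static struct toy_bss *toy_bss_lookup(struct toy_bss *tbl, size_t n,
                                      const unsigned char *bssid)
{
        for (size_t i = 0; i < n; i++)
                if (!tbl[i].expired && memcmp(tbl[i].bssid, bssid, 6) == 0)
                        return &tbl[i];
        return NULL;
}

/* Mirrors the shape of the quoted snippet: timing is set only "if (bss)". */
static void toy_set_associated(struct toy_conf *conf, struct toy_bss *tbl,
                               size_t n, const unsigned char *bssid)
{
        struct toy_bss *bss;

        assert(conf && bssid);
        conf->associated = 1;

        bss = toy_bss_lookup(tbl, n, bssid);
        if (bss) {
                /* set timing information */
                conf->beacon_int = bss->beacon_interval;
        }
        /* else: beacon_int keeps its previous value, which is 0 here */
}
```

With the entry marked expired, toy_set_associated() still flags the association but leaves beacon_int at 0, which is the pattern in the first trace; with the entry present it copies 100, as in the second.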
For some reason now your wpa_supplicant decides to disassoc, and on the
second try it's all fine:

> net/wireless/scan.c 244
> ffff88006ed5b4b0
> wlan0: authenticate with AP 00:18:39:5b:82:ca
> net/wireless/scan.c 244
> ffff88006ed5b4b0
> wlan0: authenticated
> wlan0: associate with AP 00:18:39:5b:82:ca
> net/wireless/scan.c 244
> ffff88006ed5b4b0
> net/wireless/scan.c 244
> ffff88006ed5b4b0
> wlan0: RX AssocResp from 00:18:39:5b:82:ca (capab=0x411 status=0 aid=6)
> wlan0: associated
> net/wireless/scan.c 244
> ffff88006ed5b4b0
> net/mac80211/mlme.c 641
> 100
> drivers/net/wireless/iwlwifi/iwl-agn.c 643
> 100

note the extra "mlme.c 641 / 100" lines, these are in the "if (bss)"
part.

Now, thinking a little about why this happens...

Before using cfg80211's BSS structs, we *never* expired any BSS structs.
We just hid them from userspace. This was a bug, we would forever
accumulate BSSes and use loads of memory. However, what happened above
could never happen: that a BSS struct went away while we were trying to
use it!!

Oddly, this isn't supposed to happen: we only expire structs after 10
seconds, and we continually receive beacons.

To make completely sure, can you, in addition to this debug patch, add a
printk into net/wireless/scan.c:cfg80211_bss_expire and print out the
pointer for any expired BSS? I suspect we'd have seen ffff88006ed59de0
there.

In any case, the proper solution here would be to internally keep a
reference in mac80211 to the "current BSS" struct, and hang on to it
(i.e. increase the refcount, and never use the lookup function but the
pointer we have gotten)...

Another thing we should do is to not overwrite in cfg80211_bss_update by
replacing the node, but by reallocating the node with the new
information. This needs some per-node locking though, I suspect.

Kalle, I suspect that your beacon offload will require something like
holding on to BSS structs too, is that correct?

johannes
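The "keep a reference to the current BSS" idea can be sketched in plain userspace C as below. This is an illustration under assumptions, not the mac80211 implementation: all names (toy_rbss, toy_scan_list, toy_sta and the toy_* functions) are hypothetical, the scan list holds a single entry for brevity, and a plain int stands in for whatever refcounting the kernel would use. The point is only that the station looks the BSS up once, takes its own reference, and afterwards uses the held pointer, so expiry from the list cannot pull the struct out from under it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted BSS entry; NOT the real cfg80211 struct. */
struct toy_rbss {
        int refcount;
        unsigned short beacon_interval;
};

struct toy_scan_list { struct toy_rbss *entry; };  /* one slot, for brevity */
struct toy_sta { struct toy_rbss *cur_bss; unsigned short beacon_int; };

static struct toy_rbss *toy_rbss_new(unsigned short beacon_interval)
{
        struct toy_rbss *bss = malloc(sizeof(*bss));

        if (!bss)
                return NULL;
        bss->refcount = 1;              /* reference owned by the scan list */
        bss->beacon_interval = beacon_interval;
        return bss;
}

static struct toy_rbss *toy_rbss_get(struct toy_rbss *bss)
{
        bss->refcount++;
        return bss;
}

/* Drop one reference; returns 1 if this call freed the entry. */
static int toy_rbss_put(struct toy_rbss *bss)
{
        assert(bss->refcount > 0);
        if (--bss->refcount == 0) {
                free(bss);
                return 1;
        }
        return 0;
}

/* Expiry unlinks the entry and drops only the list's own reference. */
static void toy_expire(struct toy_scan_list *list)
{
        if (list->entry) {
                toy_rbss_put(list->entry);
                list->entry = NULL;
        }
}

/* The single lookup: take a reference and remember the pointer. */
static void toy_authenticate(struct toy_sta *sta, struct toy_scan_list *list)
{
        sta->cur_bss = list->entry ? toy_rbss_get(list->entry) : NULL;
}

/* No second lookup: use the held pointer to set timing information. */
static void toy_associated(struct toy_sta *sta)
{
        if (sta->cur_bss)
                sta->beacon_int = sta->cur_bss->beacon_interval;
}

static void toy_disassociate(struct toy_sta *sta)
{
        if (sta->cur_bss) {
                toy_rbss_put(sta->cur_bss);
                sta->cur_bss = NULL;
        }
}
```

In this model, calling toy_expire() between toy_authenticate() and toy_associated() leaves the station's pointer valid, and the entry is freed only when toy_disassociate() drops the last reference.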