Hi all,

I am testing what happens when a mesh peer suddenly disappears while I am sending data to it. I am using compat-wireless 2013-04-16 plus the OpenWrt patches, on a clean 5 GHz channel at HT40.

I send a low-throughput UDP stream, then suddenly power off the receiving MP. In Wireshark I can see that the sending MP keeps transmitting approx. 900 frames for about one second before giving up. That seems like a very large number of frames (and a long detection delay).

I looked into why so many frames are sent and found the following causes, each of which raises a question.

1) ath9k does not give up before ATH_MAX_SW_RETRIES (=30) retries, some of them done by hardware and some by software in ath9k/xmit.c.

Q: I saw that the number of retries is configurable in ath5k and nl80211 (NL80211_ATTR_WIPHY_RETRY_LONG). Should we use this instead of the constant, or is this obsolete/unsupported/inappropriate/whatever?

2) ath9k retries are computed from ts_longretry. But when the peer disappears, the last retries are subject to RTS, hence the retry count ends up in ts_shortretry instead.

Q: Shouldn't we add ts_shortretry and ts_longretry together in ath_tx_complete_aggr() and ath_tx_rc_status() in ath9k/xmit.c?

3) In the rate control table, rates 1, 2 and 3 are subject to RTS. When RTS fails, the shortretry count always seems to be 10, regardless of the retry count set by the rate control. In Wireshark I do see 10 RTS retries per rate control slot. (I googled for this one; the point was raised on madwifi-devel but not answered.)

Q: Is this a "feature" of the AR9160? Is it in the standard? Or is it a hidden constant?

4) The mac80211 mesh_hwmp code computes an error rate with a decaying average. Each time a TX frame fails, it updates the error rate, and it gives up when the error rate is >95%. But when the ath9k driver retries 26x4 = 104 times, this counts as only one failure.

Q1: Shouldn't we take into account the number of retries done by the underlying driver?
Q2: Isn't a 95% error rate damn high for a usable link??? (just kidding - or not?)

From (4), 17 failed frames are required for the mesh to detect the failure. From (3), 26 frames are sent by the hardware for each try. From (2), ath9k resends each failed frame 4 times. Total: 17*4*26 = 1768 frames before the peer failure is detected. The observed count is lower because at some point the driver starts aggregating.

I plan to work on these issues. Any advice about the best way to improve this?

Cheers
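
For reference, the arithmetic above can be checked with a small simulation. This is only a sketch: the 80/20 integer moving-average update rule and the function names below are assumptions made for illustration (the message itself only states that a decaying average is used with a 95% threshold). The factors 4 and 26 are the ones quoted in points (2) and (3).

```python
def failures_until_broken(threshold=95, keep_pct=80, fail_weight=20,
                          max_frames=1000):
    """Count consecutive failed frames until the average exceeds threshold.

    Assumed update rule (integer arithmetic, scaled to 100):
        fail_avg = (keep_pct * fail_avg + 5) // 100 + fail_weight * failed
    Returns None if the threshold is never crossed within max_frames.
    """
    fail_avg = 0
    for frames in range(1, max_frames + 1):
        # Every frame fails in this scenario, so failed == 1.
        fail_avg = (keep_pct * fail_avg + 5) // 100 + fail_weight
        if fail_avg > threshold:
            return frames
    return None


def worst_case_frames(mesh_failures, sw_resends, hw_frames_per_try):
    """Upper bound on frames on air, ignoring aggregation."""
    return mesh_failures * sw_resends * hw_frames_per_try


if __name__ == "__main__":
    n = failures_until_broken()
    print("failed frames before link is declared broken:", n)
    print("worst-case frames on air:", worst_case_frames(n, 4, 26))
```

Under these assumptions the loop needs 17 consecutive failures to push the average above 95, which matches the count in (4), and 17*4*26 gives the 1768 figure. If each driver-level retry were counted as a failure instead, the same threshold would be crossed far sooner.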