On Mon, Dec 10, 2012 at 7:48 AM, Chaoxing Lin <Chaoxing.Lin@xxxxxxxxxxxxxx> wrote:
> TP>
> TP> Are you talking about a different bug?
>
> GY> Hm, maybe, but according to Chaoxing Lin's emails there are several
> GY> bugs that cause performance degradation in 802.11s mode. The symptoms
> GY> in my case are identical: I get the same results as Chaoxing Lin and
> GY> seem to hit the same troubles. I will run whatever tests you want
> GY> anyway and report the results.
>
> For easy reference, I summarize the 4 problems I have uncovered so far
> that contribute to the instability of a 7-node 802.11s network.
>
> 1. ath9k "Tx DMA error". Ping packet loss is seen each time the "Fail to
>    stop Tx DMA" log is seen. It's NOT the main cause.
>
> 2. authsae or 802.11s kernel problem: the two ends of a peer link get out
>    of sync for whatever reason. One end says the peer link is "ESTAB" and
>    all 3 keys are in place, while the other end says this peer link is
>    not "ESTAB" and no keys are installed for the peer.

We recently applied
https://github.com/cozybit/authsae/commit/0e5c65c3f773db820d6cee7b365cd4a70181c72d
which may fix your issue.

> 3. The AES-CCM pairwise key sometimes complains about packet replay, so
>    ping packets are dropped. A kernel key dump in this error case is
>    below.
>    (I overwrote the key_key_read() function in debugfs_key.c to dump all
>    info.)
>
> Key 362:
> 0xcf393800 AES-CCM Key: 49305a736a8b6d5fcb34057ee6983d44 Pairwise
> Peer MAC: 00:0e:8e:38:36:03
> tx_pn: 000000000000009f
>
> rx_pn[ 0]: 0000000d788b  rx_pn[ 1]: 000000000000
> rx_pn[ 2]: 000000000000  rx_pn[ 3]: 000000000000
> rx_pn[ 4]: 000000000000  rx_pn[ 5]: 000000000000
> rx_pn[ 6]: 000000000000  rx_pn[ 7]: 000000000000
> rx_pn[ 8]: 000000000000  rx_pn[ 9]: 000000000000
> rx_pn[10]: 000000000000  rx_pn[11]: 000000000000
> rx_pn[12]: 000000000000  rx_pn[13]: 000000000000
> rx_pn[14]: 000000000000  rx_pn[15]: 000000000000
> rx_pn[16]: 000000003580
>
> replays: 11970 icverror: <=======================problem here===========
>
> The worst thing about problems 2 and 3 above is that, once a link gets
> into this state, the mpath still stays active. So all packets are still
> routed to the bad peer link/mpath and will be dropped by the peer.

OK. Patches are welcome.

> 4. 802.11n packet aggregation. I believe this is the main problem,
>    because disabling 802.11n packet aggregation in the ath9k driver makes
>    the network stable, and problems 2 and 3 are no longer seen. In other
>    words, problems 2 and 3 may be caused by aggregation (my guess:
>    aggregation causes some error condition that is not handled properly,
>    which then triggers problems 2 and 3).

And to reproduce, you run a simultaneous ping from one node to ~6 others?

It will take me a few days to find time to reproduce this, so any
interesting observations you can offer in the meantime would be helpful.

Thanks,
Thomas
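
For anyone reading the dump in problem 3: the "replays" counter goes up
whenever a received CCMP frame carries a packet number (PN) that is not
strictly greater than the last PN accepted on the same receive queue. The
following is only a minimal Python sketch of that rule, not the mac80211
code. The class and method names are made up for illustration, decryption
and MIC checking are left out, and the "peer restarted its PN" scenario is
an assumption about one way the counter could climb, not a confirmed
diagnosis.

```python
from dataclasses import dataclass, field

# The dump shows 17 receive counters, rx_pn[0] .. rx_pn[16].
NUM_RX_QUEUES = 17


@dataclass
class CcmpRxState:
    """Hypothetical receive-side replay state for one pairwise key."""

    rx_pn: list = field(default_factory=lambda: [0] * NUM_RX_QUEUES)
    replays: int = 0

    def accept(self, queue: int, pn: int) -> bool:
        """Apply the CCMP replay rule to one frame.

        A frame passes only if its PN is strictly greater than the last
        PN accepted on this queue; otherwise it is counted as a replay
        and dropped. (A real receiver also decrypts and verifies the MIC
        before updating the counter; that part is omitted here.)
        """
        if pn <= self.rx_pn[queue]:
            self.replays += 1
            return False
        self.rx_pn[queue] = pn
        return True


state = CcmpRxState()
state.rx_pn[0] = 0x0000000D788B  # rx_pn[0] value taken from the dump

# A frame that continues the sequence is accepted.
ok_next = state.accept(0, 0x0D788C)

# Assumed scenario: the peer starts counting from a small PN again
# (for example after installing a new key on its side only). Every such
# frame is rejected until the PN climbs past the stored value.
dropped = [state.accept(0, pn) for pn in range(0xA0, 0xA5)]

print(ok_next, dropped, state.replays)
```

If something like this is what happens, the receiver would keep dropping
everything from that peer while the mpath stays active, which would match
the symptom described in problems 2 and 3.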