On 13.01.2017 at 11:21, Jouni Malinen wrote:
On Mon, Jan 09, 2017 at 02:10:51PM +0100, Thomas Graf wrote:
I tracked down the issue to lost EAPOL packets. I'm using the ath9k driver,
which retries each packet 10 times. The problem arises when all of those
retries are lost.
Are you saying that you have an environment where 10 retries of a single
EAPOL frame are not sufficient to get it through in a significant number
of cases? That sounds like quite an extreme environment.
Yes, and unfortunately that is not really extreme. Those 10 retries are
carried out within 2 msec, and then the frame is lost.
This happens even on 5 GHz because of modern access point controller
systems such as Cisco's: many admins enable scanning on other channels
for automatic channel selection or surveys. That simply means that at
some interval the access point switches to a different channel and is
unavailable for about 60 msec.
Would you be able to share a wireless sniffer capture showing that type
of case with connection failing?
http://thomas-graf.de/Downloads/LostEAPOL.pcap
In the middle of authentication, an EAP-Response frame is retried 10
times (within 2.2 msec!). In frame 20 you see a deauthentication caused
by the workaround I currently use:
I measure the time between EAPOL frames, and if wpa_supplicant stalls,
I use wpa_cli commands to retry the authentication immediately, which
usually then succeeds. Without my workaround it would wait for the
usual long timeouts.
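To make the workaround concrete, here is a minimal sketch of such a
watchdog in Python. It is not my actual script: the class name, the
200 msec threshold and the choice of "wpa_cli reassociate" as the retry
command are assumptions for illustration only, and how EAPOL frames are
observed (capture socket, supplicant events, ...) is left to the caller.

```python
import subprocess
import time

# Assumed threshold: treat a gap of 200 ms between EAPOL frames as a stall.
STALL_THRESHOLD_S = 0.2


class EapolWatchdog:
    """Triggers a retry when an EAPOL exchange pauses for too long."""

    def __init__(self, threshold_s=STALL_THRESHOLD_S, retry=None,
                 clock=time.monotonic):
        self.threshold_s = threshold_s
        self.retry = retry or self._default_retry
        self.clock = clock
        self.last_frame = None  # None means no handshake in progress

    @staticmethod
    def _default_retry():
        # Assumption: "reassociate" stands in for whichever wpa_cli
        # command restarts the authentication on a given setup.
        subprocess.run(["wpa_cli", "reassociate"], check=False)

    def on_eapol_frame(self):
        """Call for every EAPOL frame seen (sent or received)."""
        self.last_frame = self.clock()

    def handshake_done(self):
        """Call once the handshake has completed; disarms the watchdog."""
        self.last_frame = None

    def poll(self):
        """Call periodically; returns True if a retry was triggered."""
        if self.last_frame is None:
            return False
        if self.clock() - self.last_frame < self.threshold_s:
            return False
        self.last_frame = None  # disarm until the next EAPOL frame
        self.retry()
        return True
```

The retry action and the clock are injectable so the stall logic can be
exercised without a radio or a running wpa_supplicant.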
On a previous proprietary platform, the usual approach was to flag EAPOL
packets for more retries, and wpa_supplicant was notified of unsuccessful
transmissions.
The IEEE 802.1X Authenticator on the AP is expected to retry EAPOL
frames that do not get a response. Are you seeing that happening? And
are the following 10 retries from the station then all lost again in the
same way? That would sound like a really horrible radio environment, to
the point of being useless in practice. Or there is something else
causing connection issues.
Unfortunately, all higher-level retries take quite long. It does retry,
but only after several seconds, which is not acceptable for me.
From my point of view the situation is clear: the client knows at
driver/kernel level that it did not receive an ACK for that frame, and
it should react faster. My workaround recovers much faster than relying
on the long timeouts.
On Linux with wpa_supplicant I only see long timeouts after lost EAPOL
frames. I suppose mac80211 should have some extended handling for
EAPOL frames.
Always waiting for the timeout and then restarting from the beginning is
not a good solution.
So is this a complete failure to connect, or just something that takes
a long time because the Authenticator retries only after a couple of
seconds?
The retry comes, but only after a LONG time without a working connection.
That is not acceptable in an environment where I have a roaming event
every 30 sec.
From my point of view, what is missing is either direct feedback from the
kernel to the supplicant about the lost frame, or a way to tell the
kernel to keep retrying that frame for longer. I think it would be
sufficient to retry unacknowledged EAPOL frames with exponential backoff
over about 150 msec.
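As a rough illustration of what such a retry schedule could look like,
here is a small Python sketch. The 2 msec first delay and the doubling
factor are assumptions chosen for the example; only the roughly 150 msec
budget and the roughly 60 msec off-channel gap come from the discussion
above. This is not a proposal for a concrete mac80211 interface.

```python
def backoff_schedule(first_delay_ms=2.0, factor=2.0, budget_ms=150.0):
    """Return the wait (in ms) before each retry of an unacknowledged frame.

    Delays grow geometrically and the schedule stops before the total
    waiting time would exceed budget_ms.
    """
    delays = []
    elapsed = 0.0
    delay = first_delay_ms
    while elapsed + delay <= budget_ms:
        delays.append(delay)
        elapsed += delay
        delay *= factor
    return delays


schedule = backoff_schedule()
print(schedule)       # [2.0, 4.0, 8.0, 16.0, 32.0, 64.0]
print(sum(schedule))  # 126.0
```

With these assumed values the retries are spread over 126 msec, so at
least one of them falls after an access point has returned from a
60 msec off-channel scan, instead of all 10 being spent within 2 msec.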
Thomas