On 09/12/2017 05:09 PM, James Cameron wrote:
Summary: 40b368af4b75 ("rtlwifi: Fix alignment issues") breaks
rtl8821ae keep alive, causing "Connection to AP lost" and deauth, but
why?
Wireless connection is lost after a few seconds or minutes, on every
OLPC NL3 laptop with rtl8821ae, with any stable kernel after 4.10.1,
and any kernel with 40b368af4b75.
dmesg contains:
wlp2s0: Connection to AP 2c:b0:5d:a6:86:eb lost
iw event shows:
wlp2s0: del station 2c:b0:5d:a6:86:eb
wlp2s0 (phy #0): deauth 74:c6:3b:09:b5:0d -> 2c:b0:5d:a6:86:eb reason 4: Disassociated due to inactivity
wlp2s0 (phy #0): disconnected (local request)
Workaround is to bounce the link, then reconnect:
ip link set wlp2s0 down
ip link set wlp2s0 up
iw dev wlp2s0 connect qz
A nearby monitor host captures a deauthentication packet sent by the
device.
Bisection showed the cause is 40b368af4b75 ("rtlwifi: Fix alignment
issues"), which changes the width of the DBI register read.
On the face of it, 40b368af4b75 looks correct, especially compared
against the same function in rtl8723be.
I've no idea why reverting fixes the problem. I'm hoping someone here
might speculate and suggest ways to test.
As keep alive is set through this path, my guess is that keep alive is
not being set in the device. Or perhaps reading 16 bits perturbs
another register. Is there a way to test?
http://dev.laptop.org/~quozl/z/1drtGD.txt dmesg of 4.13
http://dev.laptop.org/~quozl/z/1drt7c.txt dmesg with 4.13 and revert
of 40b368af4b75
James,
I'm afraid we need to revisit this problem. Changing that 8-bit read to a
16-bit version causes an unaligned memory reference on AArch64, so we will need
to re-revert. To prevent problems on systems such as yours, PK plans to turn off
ASPM capability and backdoor on certain platforms that will be listed in a
quirks table. Please report the output of 'dmidecode -t system' for your
affected system(s).
We hope you will be able to test any proposed patches.
Thanks,
Larry