On 09/12/2017 05:09 PM, James Cameron wrote:
Summary: 40b368af4b75 ("rtlwifi: Fix alignment issues") breaks
rtl8821ae keep alive, causing "Connection to AP lost" and deauth, but
why?
Wireless connection is lost after a few seconds or minutes, on every
OLPC NL3 laptop with rtl8821ae, with any stable kernel after 4.10.1,
and any kernel with 40b368af4b75.
dmesg contains
wlp2s0: Connection to AP 2c:b0:5d:a6:86:eb lost
iw event shows
wlp2s0: del station 2c:b0:5d:a6:86:eb
wlp2s0 (phy #0): deauth 74:c6:3b:09:b5:0d -> 2c:b0:5d:a6:86:eb reason 4: Disassociated due to inactivity
wlp2s0 (phy #0): disconnected (local request)
The workaround is to bounce the link, then reconnect:
ip link set wlp2s0 down
ip link set wlp2s0 up
iw dev wlp2s0 connect qz
A nearby monitor host captures a deauthentication packet sent by the
device.
Bisection showed the cause is 40b368af4b75 ("rtlwifi: Fix alignment
issues"), which changes the width of the DBI register read.
On the face of it, 40b368af4b75 looks correct, especially compared
against the same function in rtl8723be.
I've no idea why reverting fixes the problem. I'm hoping someone here
might speculate and suggest ways to test.
As keep alive is set through this path, my guess is that keep alive is
not being set in the device. Or perhaps reading 16 bits perturbs
another register. Is there a way to test?
http://dev.laptop.org/~quozl/z/1drtGD.txt dmesg of 4.13
http://dev.laptop.org/~quozl/z/1drt7c.txt dmesg with 4.13 and revert
of 40b368af4b75
James,
Thank you very much for making the effort to bisect this problem. I know that
several people have reported the problem, which we cannot duplicate; however,
most of them just say it drops the connection and do nothing more. In fact, we
are lucky to have them even report which kernel version they are running!
As we do not see the problem, we will be relying on you to help diagnose the
issue. Merely changing the read from 8 to 16 bits should not cause any change.
As _rtl8821ae_dbi_read() is only called from _rtl8821ae_enable_aspm_back_door(),
we want to test turning off ASPM. The following patch will accomplish this.
Unfortunately, the patch is white-space damaged, thus you will need to apply it
manually. Please try it to see if it helps your connection loss. Note that ASPM
settings are preserved through a module unload/reload sequence. Thus you will
need to reboot after rebuilding the driver.
diff --git a/rtl8821ae/hw.c b/rtl8821ae/hw.c
index 305b3abbf..755d3704b 100644
--- a/rtl8821ae/hw.c
+++ b/rtl8821ae/hw.c
@@ -1982,8 +1982,8 @@ int rtl8821ae_hw_init(struct ieee80211_hw *hw)
ppsc->rfpwr_state = ERFON;
rtlpriv->cfg->ops->set_hw_reg(hw, HW_VAR_ETHER_ADDR, mac->mac_addr);
- _rtl8821ae_enable_aspm_back_door(hw);
- rtlpriv->intf_ops->enable_aspm(hw);
+ //_rtl8821ae_enable_aspm_back_door(hw);
+ //rtlpriv->intf_ops->enable_aspm(hw);
if (rtlhal->hw_type == HARDWARE_TYPE_RTL8812AE &&
(rtlhal->rfe_type == 1 || rtlhal->rfe_type == 5))
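After rebooting with the patched driver, it may be worth confirming that ASPM
really is off on the link before drawing conclusions from the test. The
following is only a sketch under assumptions: it assumes lspci from pciutils is
installed, and the helper name aspm_state is made up for illustration. It does
nothing more than pull the ASPM field out of the LnkCtl line that "lspci -vv"
prints.

```shell
#!/bin/sh
# Hypothetical helper: read `lspci -vv` output on stdin and print the
# ASPM field of the LnkCtl line, e.g. "Disabled" or "L1 Enabled".
# The LnkCtl2 line is not matched because of the colon after "LnkCtl".
aspm_state() {
    sed -n 's/^[[:space:]]*LnkCtl:[[:space:]]*ASPM \([^;]*\);.*/\1/p'
}

# Usage on the laptop, as root so the capability list is readable.
# The address 02:00.0 is an assumption taken from the wlp2s0 interface
# name; confirm it with a plain `lspci` first.
#   lspci -vv -s 02:00.0 | aspm_state
```

If this prints "Disabled", the patch most likely took effect. Anything like
"L1 Enabled" suggests ASPM is still active on that link, for instance because
something other than the driver configured it.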
Thanks,
Larry