Search Linux Wireless

Re: [PATCH v2] ath10k: add retry mechanism for ath10k_start

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Wen Gong <wgong@xxxxxxxxxxxxxx> writes:

> On 2020-02-13 19:35, Kalle Valo wrote:
>>
>> I'm not convinved about this. ath10k assumes that SDIO bus works
>> reliably and there's no data loss. In my opinion if the SDIO is not
>> working reliably we should fail immediately with a clear error message
>> for the user, instead of having an unstable connection. And I
>> understand
>> from the logs that ath10k fails cleanly in this simulated failure.
>>
>> So what you do here is ignore the assumption that the SDIO bus should
>> always work reliably and add a workaround by trying to restart the
>> firmware multiple times, and hope that by luck it works during one
>> of 10
>> retry attempts. But then what? Isn't the WLAN connection flaky as SDIO
>> bus is not reliable? So if we were to follow that design logic,
>> shouldn't we add retries for _all_ ath10k SDIO transactions? But that
>> would make ath10k even more complex as it is.
>
> for other SDIO transfer, like data tx/rx, if it fail, the upper stack
> has error mechanism to handle the fail.

Handle the fail is different from retrying. I'm all that all error cases
need to be gracefully handled and bailed out with a clear error, but I
have not seen any logic that mac80211 or driver should retry
transmissions to firmware because of hardware errors. Just as an
example, if there is data loss on the PCI/SDIO bus I don't think the
ath10k credit handling will work for long.

> but for ath10k_start, if it fails, especailly for recovery, then it can
> not recovery again, because cfg80211_shutdown_all_interfaces, and it
> need
> to reboot system to recovery wlan by test.
>>
>> Because I think this patch makes things worse for the user, so I would
>> like to understand the real life use case this patch is trying to fix
>> and how it would help the user.
>
> sometimes it has recovery/suspend/resume test case, it need to make sure
> ath10k_start success, otherwise wlan will can not recovery unless reboot
> system.

If this works 99% of the time and 1% is failing then you should find the
root cause for that 1% case and fix that. The bug might be in ath10k, in
SDIO controller driver or maybe even somewhere else.

-- 
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches



[Index of Archives]     [Linux Host AP]     [ATH6KL]     [Linux Wireless Personal Area Network]     [Linux Bluetooth]     [Wireless Regulations]     [Linux Netdev]     [Kernel Newbies]     [Linux Kernel]     [IDE]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite Hiking]     [MIPS Linux]     [ARM Linux]     [Linux RAID]

  Powered by Linux