On 8/19/21 7:31 AM, Greg Kroah-Hartman wrote:
On Thu, Aug 19, 2021 at 10:57:03AM +0800, Hui Wang wrote:
On 8/18/21 5:04 PM, Marek Vasut wrote:
On 8/18/21 7:33 AM, Greg Kroah-Hartman wrote:
On Wed, Aug 18, 2021 at 12:06:15PM +0800, Hui Wang wrote:
Hi Marex,
We backported this patch to the Ubuntu 4.15.0-generic kernel and found that
it makes the rsi driver crash on system resume on the Dell 300x IoT
platform (100% reproducible). The log is below. Once this log appears, the
rsi wifi no longer works, and we need to run 'rmmod rsi_sdio; modprobe
rsi_sdio' to make it work again.
So do you know what is missing apart from this patch, or is this patch not
suitable for the 4.15 kernel at all?
Does 4.19.191 work for this system? Why not just use that or newer
instead?
I haven't seen this on linux-stable 5.4.y or 5.10.y, if that information
is of any use.
But I have to admit, I am tempted to mark the whole driver as BROKEN and
submit that for stable backports.
Because that is what it is, it is buggy, broken, and the hardware lacks
any documentation. I spent an insane amount of time talking to RedPine
Signals / SiLabs trying to get help with basic things like association
problems against various APs, no result there. I tried getting hardware
docs from them so I can fix the driver myself, no result either. So far
I tried to pick various fixes from their downstream driver and submit
them, but that is massively time consuming and the changes there are not
separated or documented, it is just one large chunk of code.
As far as I can tell, they also have no interest in fixing the driver or
helping others with fixing it, so maybe we should just mark it as broken
... :-(
Hi Marek,
Got it, thanks for sharing it.
Hi Greg,
I just tested 4.19.191 and got the same result. The wifi still crashes
after resume under 4.19.191:
admin@HW6VB02:~$ uname -a
Linux HW6VB02 4.19.191 #1 SMP Thu Aug 19 10:19:32 CST 2021 x86_64 x86_64 x86_64 GNU/Linux
[ 59.682908] sdhci-acpi INT33BB:00: pre_suspend failed for non-removable host: -38
[ 59.682917] Freezing user space processes ... (elapsed 0.003 seconds) done.
[ 59.686063] OOM killer disabled.
[ 59.686065] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[ 59.687385] Suspending console(s) (use no_console_suspend to debug)
[ 59.687931] rsi_91x: ===> Interface DOWN <===
[ 70.068983] mmc1: Controller never released inhibit bit(s).
[ 70.068992] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
[ 70.069002] mmc1: sdhci: Sys addr: 0xffffffff | Version: 0x0000ffff
[ 70.069009] mmc1: sdhci: Blk size: 0x0000ffff | Blk cnt: 0x0000ffff
[ 70.069016] mmc1: sdhci: Argument: 0xffffffff | Trn mode: 0x0000ffff
[ 70.069023] mmc1: sdhci: Present: 0xffffffff | Host ctl: 0x000000ff
[ 70.069030] mmc1: sdhci: Power: 0x000000ff | Blk gap: 0x000000ff
[ 70.069036] mmc1: sdhci: Wake-up: 0x000000ff | Clock: 0x0000ffff
[ 70.069043] mmc1: sdhci: Timeout: 0x000000ff | Int stat: 0xffffffff
So should we revert this commit from 4.19.y?
If you revert it, does it work properly? What about in Linus's tree?
I suspect that in that case sdio_claim_host() will spin indefinitely and
never finish; see the commit message of c434e5e48dc4e ("rsi: Use
resume_noirq for SDIO").
Note that I did my tests on ARM MMCI (stm32mp1 variant).
This "[ 70.068983] mmc1: Controller never released inhibit bit(s)"
looks suspicious in the log above.
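As a rough aid for reading the register dump above: every field in it is
all-ones for its width (0xff, 0xffff or 0xffffffff), which is usually what
register reads return when a controller is powered down or otherwise not
responding on the bus. The sketch below is only a heuristic for checking
this on a dmesg excerpt. The helper names parse_sdhci_dump() and
looks_unreachable() are hypothetical and not part of any kernel tooling, and
a single register can legitimately read all-ones on a healthy controller.

```python
import re

# Matches "Name: 0xXXXXXXXX" pairs as printed in the SDHCI register dump.
# Field names may contain spaces or hyphens (e.g. "Sys addr", "Wake-up").
FIELD_RE = re.compile(r"([A-Za-z][A-Za-z \-]*?):\s+0x([0-9a-fA-F]{8})")


def parse_sdhci_dump(text):
    """Return {register name: value} for all 'sdhci:' lines in a dmesg excerpt."""
    regs = {}
    for line in text.splitlines():
        if "sdhci:" not in line:
            continue
        payload = line.split("sdhci:", 1)[1]
        for name, value in FIELD_RE.findall(payload):
            regs[name.strip()] = int(value, 16)
    return regs


def looks_unreachable(regs):
    """Heuristic: True if every parsed register is all-ones for an
    8-, 16- or 32-bit width. An empty dump yields False."""
    all_ones = (0xFF, 0xFFFF, 0xFFFFFFFF)
    return bool(regs) and all(v in all_ones for v in regs.values())
```

Feeding it the dump from the log in this thread should report every
register as all-ones, which fits the "Controller never released inhibit
bit(s)" message: the host controller is probably not accessible at that
point in resume.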
Also, newer versions of the RSI downstream driver [1] as of 390542d
("Updated Readme.txt file") simply comment out
rsi_sdio_enable_interrupts() in rsi/rsi_91x_sdio.c rsi_resume(), which
suggests that RSI ran into the same problem but "fixed" it differently. I
think the approach RSI took is wrong and only hides the issue.
[1] git://github.com/SiliconLabs/RS911X-nLink-OSD