On 2/11/22 11:15, Ajay.Kathat@xxxxxxxxxxxxx wrote:
On 10/02/22 21:55, Marek Vasut wrote:
On 2/10/22 17:19, Ajay.Kathat@xxxxxxxxxxxxx wrote:
Hi,
On 10/02/22 14:10, Christoph Niedermaier wrote:
From: Ajay.Kathat@xxxxxxxxxxxxx [mailto:Ajay.Kathat@xxxxxxxxxxxxx]
Sent: Wednesday, February 9, 2022 3:37 PM
On 08/02/22 21:56, Christoph Niedermaier wrote:
Hello,
I tested the wireless chip wilc1000 with the 5.16.5 Kernel and the
firmware v15.4.1
(https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/atmel/wilc1000_wifi_firmware-1.bin)
on an i.MX6 QUAD with iperf3:
# iperf3 -c IP_ADDR -P 16 -t 0
After a while the test gets stuck and I get the following kernel
messages:
mmc0: Timeout waiting for hardware interrupt.
mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
mmc0: sdhci: Sys addr: 0x138f0200 | Version: 0x00000002
mmc0: sdhci: Blk size: 0x00000158 | Blk cnt: 0x00000001
mmc0: sdhci: Argument: 0x14000158 | Trn mode: 0x00000013
mmc0: sdhci: Present: 0x01d88a0a | Host ctl: 0x00000013
mmc0: sdhci: Power: 0x00000002 | Blk gap: 0x00000080
mmc0: sdhci: Wake-up: 0x00000008 | Clock: 0x0000009f
mmc0: sdhci: Timeout: 0x0000008f | Int stat: 0x00000000
mmc0: sdhci: Int enab: 0x107f100b | Sig enab: 0x107f100b
mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000003
mmc0: sdhci: Caps: 0x07eb0000 | Caps_1: 0x0000a000
mmc0: sdhci: Cmd: 0x0000353a | Max curr: 0x00ffffff
mmc0: sdhci: Resp[0]: 0x00001000 | Resp[1]: 0x00000000
mmc0: sdhci: Resp[2]: 0x00000000 | Resp[3]: 0x00000000
mmc0: sdhci: Host ctl2: 0x00000000
mmc0: sdhci: ADMA Err: 0x00000007 | ADMA Ptr: 0x4c041200
mmc0: sdhci-esdhc-imx: ========= ESDHC IMX DEBUG STATUS DUMP =========
mmc0: sdhci-esdhc-imx: cmd debug status: 0x2100
mmc0: sdhci-esdhc-imx: data debug status: 0x2200
mmc0: sdhci-esdhc-imx: trans debug status: 0x2300
mmc0: sdhci-esdhc-imx: dma debug status: 0x2402
mmc0: sdhci-esdhc-imx: adma debug status: 0x25b4
mmc0: sdhci-esdhc-imx: fifo debug status: 0x2610
mmc0: sdhci-esdhc-imx: async fifo debug status: 0x2751
mmc0: sdhci: ============================================
wilc1000_sdio mmc0:0001:1: wilc_sdio_cmd53..failed, err(-110)
wilc1000_sdio mmc0:0001:1: Failed cmd53 [0], bytes read...
I tried to reduce the clock speed to 20 MHz in the devicetree with
max-frequency = <20000000>;
but the problem still occurs.
Is this a possible bug?
Hi Ajay,
Thanks for the answer.
The bus error seems to be specific to the host during the SDIO
transfer.
How long does it take to reproduce it? Does the issue also happen
without the "-P 16" iperf3 option?
It takes about 10 s (sometimes a bit longer) until I get these kernel
error messages, and it doesn't matter whether I run it with "-P 16" or
without.
I did not observe the issue with my setup (SAMA5D4 XPLAINED + WILC1000
SDIO) when I ran iPerf for a longer duration (~1000 s). I suspect the
issue could be related to the SDHCI host controller.
Try to debug the host controller side for the possible cause of the timeout.
It seems the timeout happens because the card fails to respond to SDIO
command 53, right ?
Yes, the timeout could have several causes: either the CMD53 did not
reach the chip, or the response was not received correctly at the host end.
The problem happens seconds or tens of seconds into the test, so there
must've been CMD53 which reached the card before the problem occurred,
and there must have been a lot of those CMD53 before the problem
happened too, since CMD53 seems to be some data transfer CMD ?
Is there some error logging/tracing functionality in the WILC1000
firmware which can provide further information why the card did not
respond ?
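For reference, the "Argument" and "Cmd" values from the register dump quoted earlier can be decoded by hand. The sketch below assumes the CMD53 (IO_RW_EXTENDED) argument layout from the SDIO specification and the command index field at bits 13:8 of the SDHCI Command register. The function names are made up for illustration and are not part of any driver.

```python
def decode_cmd53_arg(arg: int) -> dict:
    """Split a 32-bit SDIO CMD53 (IO_RW_EXTENDED) argument into its fields."""
    return {
        "write": bool((arg >> 31) & 0x1),                 # bit 31: 1 = write, 0 = read
        "function": (arg >> 28) & 0x7,                    # bits 30:28: SDIO function number
        "block_mode": bool((arg >> 27) & 0x1),            # bit 27: 1 = block mode, 0 = byte mode
        "incrementing_address": bool((arg >> 26) & 0x1),  # bit 26: OP code
        "address": (arg >> 9) & 0x1FFFF,                  # bits 25:9: register address
        "count": arg & 0x1FF,                             # bits 8:0: byte or block count
    }


def sdhci_command_index(cmd_reg: int) -> int:
    """Extract the command index (bits 13:8) from the SDHCI Command register."""
    return (cmd_reg >> 8) & 0x3F


if __name__ == "__main__":
    # Values taken from the SDHCI register dump in this thread.
    print(decode_cmd53_arg(0x14000158))
    print(sdhci_command_index(0x0000353A))
```

Under these assumptions the dump shows command index 53 issued as a byte-mode read of 344 (0x158) bytes from function 1 at address 0, which matches the "Blk size: 0x00000158" field and the "Failed cmd53 [0], bytes read" message.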
The WILC1000 SD module has a UART serial debug port for firmware logs, but
I don't think it would be useful here because this needs to be
debugged/probed at the SDIO bus level.
Is there some other kind of logging which can tell us more details on
where to look for this problem ?
Maybe we can try monitoring the SDIO traffic with ftrace ?
Any other options, short of taking the hardware apart ?
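A minimal sketch of switching on the kernel's mmc trace events from Python, in case it helps with the ftrace idea. It assumes tracefs is mounted at /sys/kernel/tracing, that the kernel exposes the "mmc" event group (mmc_request_start / mmc_request_done), and that it runs as root. The helper name is hypothetical.

```python
from pathlib import Path


def enable_mmc_tracing(tracefs: Path = Path("/sys/kernel/tracing")) -> Path:
    """Enable all events in the 'mmc' group and return the trace_pipe path."""
    # Writing "1" to events/<group>/enable switches on every event in that group.
    (tracefs / "events" / "mmc" / "enable").write_text("1")
    # Make sure the ring buffer is actually recording.
    (tracefs / "tracing_on").write_text("1")
    # trace_pipe streams events as they happen; reading it blocks until data arrives.
    return tracefs / "trace_pipe"
```

After calling enable_mmc_tracing(), reading the returned trace_pipe while iperf3 runs should show the mmc requests leading up to the timeout. The tracefs path is a parameter so that a different mount point (for example under debugfs) can be passed in.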
Could it be the card suffered some sort of FIFO overflow ? The MX6Q is a
bit more performant than the CA7 (I think?) SAMA5D4, so maybe that plays
some role ?
As I understand it, the issue is observed with basic iPerf testing (less
throughput), so I am not sure the host performance would have such an
impact. IIRC a few customers are using the same host (i.MX6), though
I am not sure whether it's over the SPI or SDIO bus. So far, I have not
come across such limitations with this specific host.
That iperf -P 16 hammers the chip with a lot of short packets; the
problem does not occur during an iperf3 -P 1 run or a UDP iperf3 run
(that's the one with low traffic). Here the interface is saturated,
which is why I speculate some sort of FIFO overrun is happening.
I have also noticed there are some wilc1000 downstream drivers with huge
stacks of patches, but I never really figured out whether those are
still relevant or whether the upstream wilc1000 driver is perfectly
fine. I would like to believe it is the latter, is it ?