On 7/14/2022 4:39 PM, Ulf Hansson wrote:
On Thu, 14 Jul 2022 at 14:49, Arend Van Spriel <aspriel@xxxxxxxxx> wrote:
On 7/14/2022 12:04 PM, Ulf Hansson wrote:
On Tue, 12 Jul 2022 at 01:22, Danny van Heumen <danny@xxxxxxxxxxxxxxxxx> wrote:
In case of buggy firmware, brcmfmac may perform a hardware reset. If during
reset and subsequent probing an early failure occurs, a memory region is
accidentally double-freed. With hardened memory allocation enabled, this error
will be detected.
- return early where appropriate to skip unnecessary clean-up.
- set '.freezer' pointer to NULL to prevent double-freeing under possible
other circumstances and to re-align result under various different
behaviors of memory allocation freeing.
- correctly claim host on func1 for disabling func2.
- after reset, do not initiate probing immediately, but rely on events.
Given a firmware crash, function 'brcmf_sdio_bus_reset' is called. It calls
'brcmf_sdiod_remove', then follows up with 'brcmf_sdiod_probe' to reinitialize
the hardware. If 'brcmf_sdiod_probe' fails to "set F1 blocksize", it exits
early, which includes calling 'brcmf_sdiod_remove'. In both cases
'brcmf_sdiod_freezer_detach' is called to free allocated '.freezer', which
has not yet been re-allocated the second time.
Stacktrace of (failing) hardware reset after firmware-crash:
Code: b9402b82 8b0202c0 eb1a02df 54000041 (d4210000)
ret_from_fork+0x10/0x20
kthread+0x154/0x160
worker_thread+0x188/0x504
process_one_work+0x1f4/0x490
brcmf_core_bus_reset+0x34/0x44 [brcmfmac]
brcmf_sdio_bus_reset+0x68/0xc0 [brcmfmac]
brcmf_sdiod_probe+0x170/0x21c [brcmfmac]
brcmf_sdiod_remove+0x48/0xc0 [brcmfmac]
kfree+0x210/0x220
__slab_free+0x58/0x40c
Call trace:
x2 : 0000000000000040 x1 : fffffc00002d2b80 x0 : ffff00000b4aee40
x5 : ffff8000013fa728 x4 : 0000000000000001 x3 : ffff00000b4aee00
x8 : ffff800009967ce0 x7 : ffff8000099bfce0 x6 : 00000006f8005d01
x11: ffff8000099bfce0 x10: 00000000fffff000 x9 : ffff8000083401d0
x14: 0000000000000000 x13: 657a69736b636f6c x12: 6220314620746573
x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000030
x20: fffffc00002d2ba0 x19: fffffc00002d2b80 x18: 0000000000000000
x23: ffff00000b4aee00 x22: ffff00000b4aee00 x21: 0000000000000001
x26: ffff00000b4aee00 x25: ffff0000f7753705 x24: 000000000001288a
x29: ffff80000a22bbf0 x28: ffff000000401200 x27: 000000008020001a
sp : ffff80000a22bbf0
lr : kfree+0x210/0x220
pc : __slab_free+0x58/0x40c
pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
Workqueue: events brcmf_core_bus_reset [brcmfmac]
Hardware name: Pine64 Pinebook Pro (DT)
CPU: 2 PID: 639 Comm: kworker/2:2 Tainted: G C 5.16.0-0.bpo.4-arm64 #1 Debian 5.16.12-1~bpo11+1
nvmem_rockchip_efuse industrialio_triggered_buffer videodev snd_soc_core snd_pcm_dmaengine kfifo_buf snd_pcm io_domain mc industrialio mt>
Modules linked in: snd_seq_dummy snd_hrtimer snd_seq snd_seq_device nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reje>
Internal error: Oops - BUG: 0 [#1] SMP
kernel BUG at mm/slub.c:379!
Signed-off-by: Danny van Heumen <danny@xxxxxxxxxxxxxxxxx>
I have to admit that, to me, it looks a bit weird to have one driver
instance managing two different SDIO func devices. On the other hand,
I don't know the HW so there might be good reasons for why.
In any case, I want to point out a commit [1] for the mwifiex driver,
which could serve as a good inspiration of how to make use of the
mmc_hw_reset(). For example, one may look at the return code from
mmc_hw_reset() to understand whether the reset will be done
synchronous or asynchronous (via device re-registration).
Thanks, Uffe
Could the API be extended so the caller can request the reset to be
asynchronous.
Yes, that should be fine. The current behaviour is that it will be
asynchronous if there are more than one SDIO func device bound to a
driver.
I see. So for brcmfmac we fall into that category. Thanks for the info.
Regards,
Arend