Search Linux Wireless

Re: [PATCH v3] brcmfmac: prevent double-free on hardware-reset.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi all,

I missed the response for a while. Following up in-line.

------- Original Message -------
On Wednesday, June 22nd, 2022 at 14:36, Arend van Spriel <arend.vanspriel@xxxxxxxxxxxx> wrote:

> + Uffe
>
> On 6/17/2022 4:24 PM, Danny van Heumen wrote:
>
> > From f1fcceb65d4a44c078cd684ea25a2f2c7f53deb2 Mon Sep 17 00:00:00 2001
> > From: Danny van Heumen danny@xxxxxxxxxxxxxxxxx
> > Date: Tue, 24 May 2022 18:30:50 +0200
> > Subject: [PATCH v3] brcmfmac: prevent double-free on hardware-reset.
> >
> > In case of buggy firmware, brcmfmac may perform a hardware reset. If during
> > reset and subsequent probing an early failure occurs, a memory region is
> > accidentally double-freed. With hardened memory allocation enabled, this error
> > will be detected.
> >
> > - return early where appropriate to skip unnecessary clean-up.
> > - set '.freezer' pointer to NULL to prevent double-freeing under possible
> > other circumstances and to re-align result under various different
> > behaviors of memory allocation freeing.
> >
> > Given a firmware crash, function 'brcmf_sdio_bus_reset' is called. It calls
> > 'brcmf_sdiod_remove', then follows up with 'brcmf_sdiod_probe' to reinitialize
> > the hardware. If 'brcmf_sdiod_probe' fails to "set F1 blocksize", it exits
> > early, which includes calling 'brcmf_sdiod_remove'. In both cases
> > 'brcmf_sdiod_freezer_detach' is called to free allocated '.freezer', which
> > has not yet been re-allocated the second time.
>
>
> For testing you can also trigger the brcmf_sdio_bus_reset() through debugfs:
>
> # cd /sys/kernel/debug/ieee80211/phyX
> # echo 1 > reset

This works. I have done a few resets in a loop to see if the hardware keeps
recovering and no other issues occur. Everything seems to be fine.


>
>
> So I did that without your patch and observed following:
>
> [ 531.481045] brcmfmac: brcmf_sdiod_probe: Failed to set F1 blocksize
>
> [ 531.487314] brcmfmac: brcmf_sdio_bus_reset: Failed to probe after
> sdio device
> reset: ret -123
>
> [ 531.495893] mmc0: card 0001 removed
>
> [ 531.550567] mmc0: new high speed SDIO card at address 0001
>
> [ 531.556561] brcmfmac: F1 signature read @0x18000000=0x16044330
>
> So I looked in brcmf_sdio_bus_reset() and noticed that this function
> actually does a call to mmc_hw_reset() which is what I suggested
> earlier. Looking at this log it seems the actual remove and subsequent
> rescan is a deferred work and it will take care of probing the driver anew.

I can confirm that this works. This is helpful.


>
> So I changed debug message level for the sdio ops and then I see:
>
> [ 1327.686828] brcmfmac: brcmf_sdiod_probe: Failed to set F1 blocksize
>
> [ 1327.693091] brcmfmac: brcmf_sdio_bus_reset: Failed to probe after
> sdio device
> reset: ret -123
>
> [ 1327.701626] brcmfmac: brcmf_ops_sdio_remove Enter
>
> [ 1327.706333] brcmfmac: brcmf_ops_sdio_remove Function: 1
>
> [ 1327.711557] brcmfmac: brcmf_ops_sdio_remove Exit
>
> [ 1327.716203] brcmfmac: brcmf_ops_sdio_remove Enter
>
> [ 1327.720911] brcmfmac: brcmf_ops_sdio_remove Function: 2
>
> [ 1327.726135] brcmfmac: brcmf_ops_sdio_remove Exit
>
> [ 1327.730768] mmc0: card 0001 removed
>
> [ 1327.785458] mmc0: new high speed SDIO card at address 0001
>
> [ 1327.791068] brcmfmac: brcmf_ops_sdio_probe Enter
>
> [ 1327.795687] brcmfmac: brcmf_ops_sdio_probe Function#: 1
>
> [ 1327.800988] brcmfmac: brcmf_ops_sdio_probe Enter
>
> [ 1327.805610] brcmfmac: brcmf_ops_sdio_probe Function#: 2
>
> [ 1327.811317] brcmfmac: F1 signature read @0x18000000=0x16044330
>
> So to me it seems mmc_hw_reset() is doing everything we need and we can
> make brcmf_sdio_bus_reset() a bit simpler. The only scary part here is
> that brcmf_ops_sdio_remove() is called first for func1 and then for
> func2. Upon rmmod this is different, ie. first func2 and then func1.
> Looking at the implementation in brcmf_ops_sdio_remove() it means we run
> into use-after-free upon mmc_hw_reset().

I have looked into this. The logic itself protects from duplicate behavior
(at least partially) by checking which func is being removed. However,
irq-releases are duplicated because the check only happens after that logic.
This is also necessary, because the kernel warns if the irq isn't released
after executing the callback.

In the next patch (prefixed "PATCH v4", already sent), I have slightly
adjusted the control flow such that irq-release happens for each func
-- as per expectation -- while the removal logic remains rooted at func1,
but executed based on the full device (`sdiodev` as function arg). This
should, IIUC, be correct regardless of function order.

This solution isn't a perfect redesign, but should respect both the design
itself and also sdio driver expectations.

Please find the patch at: g_Py6bM1lfcJOWWmHwKU8x4tCFrTRdgFtoM13qYHeN441F392j_6etJnEJ8gHJMRZ6OEKxpJYuP45x3iziHqY6HNXnVwIiyvJLYjvzxT0Xk=%20()%20dannyvanheumen%20!%20nl

Regards,
Danny






[Index of Archives]     [Linux Host AP]     [ATH6KL]     [Linux Wireless Personal Area Network]     [Linux Bluetooth]     [Wireless Regulations]     [Linux Netdev]     [Kernel Newbies]     [Linux Kernel]     [IDE]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite Hiking]     [MIPS Linux]     [ARM Linux]     [Linux RAID]

  Powered by Linux