Search Linux Wireless

Re: [PATCH] wifi: wilc1000: validate chip id during bus probe

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 1/25/24 07:23, Ajay.Kathat@xxxxxxxxxxxxx wrote:
> On 1/24/24 11:45, David Mosberger-Tang wrote:
>> EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
>>
>> On Wed, 2024-01-24 at 10:31 -0700, David Mosberger-Tang wrote:
>>> Alexis,
>>>
>>> On Wed, 2024-01-24 at 10:01 +0100, Alexis Lothoré wrote:
>>>> ==================================================================
>>>> BUG: KASAN: slab-use-after-free in wilc_netdev_cleanup+0x294/0x2c0
>>>> Read of size 4 at addr c3c91ce8 by task swapper/1

Replying a bit late to your initial questions:
- I am running an ARM32 platform (SAMA5D27)
- for the wilc_netdev_cleanup line, the 0x294 offset indeed points to
list_for_each_entry_rcu(vif, &wilc->vif_list, list) in my case, but you seemed
to have identified this already.

>>>
>>> OK, I think I see what's going on: it's the list traversal.  Here is what
>>> wilc_netdev_cleanup() does:
>>>
>>>       list_for_each_entry_rcu(vif, &wilc->vif_list, list) {
>>>               if (vif->ndev)
>>>                       unregister_netdev(vif->ndev);
>>>       }
>>>
>>> The problem is that "vif" is the private part of the netdev, so when the netdev
>>> is freed, the vif structure is no longer valid and list_for_each_entry_rcu()
>>> ends up dereferencing a dangling pointer.

Your diagnostic sounds relevant :)

>>> Ajay or Alexis, could you propose a fix for this - this is not something I'm
>>> familiar with.
>>
>> Actually, after staring at the code a little longer, is there anything wrong
>> with delaying the unregister_netdev() call until after the vif has been removed
>> from the vif-list?  Something along the lines of the below.

I guess you _could_ do something like this, but I think you have to be very
careful about potential races. If I'm not wrong the following could happen:
- you enter the wilc_netdev_cleanup and start removing vifs from list
- meanwhile, because your net device is still registered, userspace can start
calling concurrently some cgf80211_ops
- some of those ops may try to get the vif matching your netdevice from the
list, but it is not there anymore

Maybe some rtnl or wiphy lock (used from cfg80211 core or net core) will save
you from this for some of wilc_netdev_cleanup calls, but I think that won't be
true for the one in the error path of the probe chain.

> I think we need to investigate the actual reason for Kasan warning.
> First, we need to confirm if this warning exists without the patch(test
> by simulating a force error in wilc_bus_probe()). When
> wilc_netdev_cleanup() is also called from wilc_bus_remove(), I believe
> this warning was not observed. Once it is confirmed, the fix can be done
> accordingly.

It happens too in wilc_bus_remove:
echo "spi0.1" > /sys/bus/spi/drivers/wilc1000_spi/unbind

==================================================================
BUG: KASAN: slab-use-after-free in wilc_netdev_cleanup+0xf0/0x244
Read of size 4 at addr c3c91ce8 by task sh/91

CPU: 0 PID: 91 Comm: sh Not tainted 6.7.0-wt+ #845
Hardware name: Atmel SAMA5
 unwind_backtrace from show_stack+0x18/0x1c
 show_stack from dump_stack_lvl+0x34/0x48
 dump_stack_lvl from print_report+0x154/0x500
 print_report from kasan_report+0xd8/0x100
 kasan_report from wilc_netdev_cleanup+0xf0/0x244
 wilc_netdev_cleanup from wilc_bus_remove+0x50/0x5c
 wilc_bus_remove from spi_remove+0x40/0x50
 spi_remove from device_release_driver_internal+0x21c/0x2ac
 device_release_driver_internal from unbind_store+0x70/0xac
 unbind_store from kernfs_fop_write_iter+0x1a0/0x284
 kernfs_fop_write_iter from vfs_write+0x38c/0x6f4
 vfs_write from ksys_write+0xd8/0x178
 ksys_write from ret_fast_syscall+0x0/0x1c
Exception stack(0xc588ffa8 to 0xc588fff0)
ffa0:                   00000007 004eff68 00000001 004eff68 00000007 00000001
ffc0: 00000007 004eff68 00000001 00000004 00000007 b69546c0 00000020 004ed190
ffe0: 00000004 b6954678 aec3a041 aebd1a26

> Avoiding netdev initialization in wilc_cfg80211_init() would require lot
> of modification and changes in API call order. IMO the Kasan warning fix
> should be independent of netdev initialization order and it should
> free-up the resources for all cases. Suppose if the card is not present,
> the probe API should clean-up and exit gracefully. For detecting the
> chip_id, I have created the below patch considering the above scenarios,
> please check if it makes sense.

Agree about the wilc_netdev_cleanup fix being a separated topic, to be handled
accordingly.
Still, the scenario I mention above, if true, makes me more confident that we
should not register at all the netdevice before being able to manage it. Maybe
those cases are already handled correctly with some checks to make sure no real
crash occurs, but all those checks could be avoided if we did not register the
netdevice so early

Alexis

-- 
Alexis Lothoré, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com





[Index of Archives]     [Linux Host AP]     [ATH6KL]     [Linux Wireless Personal Area Network]     [Linux Bluetooth]     [Wireless Regulations]     [Linux Netdev]     [Kernel Newbies]     [Linux Kernel]     [IDE]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite Hiking]     [MIPS Linux]     [ARM Linux]     [Linux RAID]

  Powered by Linux