On 1/23/24 18:39, David Mosberger-Tang wrote: > On Tue, 2024-01-23 at 11:18 +0100, Alexis Lothoré wrote: >> On 1/22/24 23:03, David Mosberger-Tang wrote: [...] >> I have a working wilc-over-spi setup with which I can easily unplug the module, >> so I gave a try to your series, and while the lack of chip detect indeed makes >> the netdevice registration not executed, I've got a nasty kasan warning: >> >> driver_probe_device from __driver_attach+0x1a0/0x29c >> >> >> >> [141/1863] >> __driver_attach from bus_for_each_dev+0xf0/0x14c >> bus_for_each_dev from bus_add_driver+0x130/0x288 >> bus_add_driver from driver_register+0xd4/0x1c0 >> driver_register from do_one_initcall+0xfc/0x204 >> do_one_initcall from kernel_init_freeable+0x240/0x2a0 >> kernel_init_freeable from kernel_init+0x20/0x144 >> kernel_init from ret_from_fork+0x14/0x28 >> Exception stack(0xc3163fb0 to 0xc3163ff8) >> 3fa0: 00000000 00000000 00000000 00000000 >> 3fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 >> 3fe0: 00000000 00000000 00000000 00000000 00000013 00000000 >> >> Allocated by task 1: >> kasan_set_track+0x3c/0x5c >> __kasan_kmalloc+0x8c/0x94 >> __kmalloc_node+0x64/0x184 >> kvmalloc_node+0x48/0x114 >> alloc_netdev_mqs+0x68/0x664 >> alloc_etherdev_mqs+0x28/0x34 >> wilc_netdev_ifc_init+0x34/0x37c >> wilc_cfg80211_init+0x278/0x330 >> wilc_bus_probe+0xb4/0x398 >> spi_probe+0xb8/0xdc >> really_probe+0x134/0x588 >> __driver_probe_device+0xe0/0x288 >> driver_probe_device+0x60/0x118 >> __driver_attach+0x1a0/0x29c >> bus_for_each_dev+0xf0/0x14c >> bus_add_driver+0x130/0x288 >> driver_register+0xd4/0x1c0 >> do_one_initcall+0xfc/0x204 >> kernel_init_freeable+0x240/0x2a0 >> kernel_init+0x20/0x144 >> ret_from_fork+0x14/0x28 >> >> Freed by task 1: >> kasan_set_track+0x3c/0x5c >> kasan_save_free_info+0x30/0x3c >> __kasan_slab_free+0xe4/0x12c >> __kmem_cache_free+0x94/0x1cc >> device_release+0x54/0xf8 >> kobject_put+0xf4/0x238 >> netdev_run_todo+0x414/0x7dc >> wilc_netdev_cleanup+0xe4/0x244 >> wilc_bus_probe+0x2b8/0x398 >> spi_probe+0xb8/0xdc >> really_probe+0x134/0x588 >> __driver_probe_device+0xe0/0x288 >> driver_probe_device+0x60/0x118 >> __driver_attach+0x1a0/0x29c >> bus_for_each_dev+0xf0/0x14c >> bus_add_driver+0x130/0x288 >> driver_register+0xd4/0x1c0 >> do_one_initcall+0xfc/0x204 >> kernel_init_freeable+0x240/0x2a0 >> kernel_init+0x20/0x144 >> ret_from_fork+0x14/0x28 >> >> It looks like an already existing/dormant issue in the error-managing path of >> spi probe of the driver (looks like we are trying to unregister a netdevice >> which has never been registered ?), but since your series triggers it, it should >> be handled too. > > I need help interpreting this. What does KASAN actually complain about? A > double free or something else? I see that the kasan dump from my last email is truncated, but the first line clearly mentions a use-after-free: ================================================================== BUG: KASAN: slab-use-after-free in wilc_netdev_cleanup+0x294/0x2c0 Read of size 4 at addr c3c91ce8 by task swapper/1 CPU: 0 PID: 1 Comm: swapper Not tainted 6.7.0-wt+ #843 Hardware name: Atmel SAMA5 unwind_backtrace from show_stack+0x18/0x1c show_stack from dump_stack_lvl+0x34/0x48 dump_stack_lvl from print_report+0x154/0x500 print_report from kasan_report+0xd8/0x100 kasan_report from wilc_netdev_cleanup+0x294/0x2c0 wilc_netdev_cleanup from wilc_bus_probe+0x2b8/0x398 wilc_bus_probe from spi_probe+0xb8/0xdc spi_probe from really_probe+0x134/0x588 really_probe from __driver_probe_device+0xe0/0x288 __driver_probe_device from driver_probe_device+0x60/0x118 driver_probe_device from __driver_attach+0x1a0/0x29c __driver_attach from bus_for_each_dev+0xf0/0x14c bus_for_each_dev from bus_add_driver+0x130/0x288 bus_add_driver from driver_register+0xd4/0x1c0 driver_register from do_one_initcall+0xfc/0x204 do_one_initcall from kernel_init_freeable+0x240/0x2a0 kernel_init_freeable from kernel_init+0x20/0x144 kernel_init from ret_from_fork+0x14/0x28 Not sure though what's wrong without digging further. > register_netdev() does get called (through wilc_cfg80211_init()) and then when > the chip detect fails, unregister_netdev() gets called (from > wilc_netdev_cleanup()), so I don't see any obvious issues, but there is a lot of > other stuff going on there that I'm not familiar with. My bad, your statement made me realize I overlooked things here: aside from the kasan warning, your patch makes the probe function do the following steps: - create and register (wiphy and) netdevice - check if chip is detected on bus - unregister/clean up everything if chip does not respond There's no point in pre-registering the netdevice so early if we add an error path due to chip being absent, I would even say that the whole point of your series is to prevent registering the device if chip can not be accessed. So IMHO chip detection should be done before trying to register the netdevice. It will prevent all the complications due to the whole reverse unregistration (and all weird issues that can occur because of device being registered while not being fully ready) >>> + if (base_id != WILC_1000_BASE_ID && base_id != WILC_3000_BASE_ID) { >> >> - WILC3000 is currently not supported (yet) by the upstream driver, so there is >> no reason to validate its presence if we can not handle it later. Any mention of >> it should then be removed from this series > > Oh, I didn't realize that. I was just going off of this web page: > > https://www.microchip.com/en-us/software-library/wilc1000_3000_linux_driver Understood, but again, your patch must be based on upstream trees :) -- Alexis Lothoré, Bootlin Embedded Linux and Kernel engineering https://bootlin.com