On Mon, Feb 24, 2025 at 3:31 PM Koichiro Den <koichiro.den@xxxxxxxxxxxxx> wrote: > > Both new_device_store and delete_device_store touch module global > resources (e.g. gpio_aggregator_lock). To prevent race conditions with > module unload, a reference needs to be held. > > Add try_module_get() in these handlers. > > For new_device_store, this eliminates what appears to be the most dangerous > scenario: if an id is allocated from gpio_aggregator_idr but > platform_device_register has not yet been called or completed, a concurrent > module unload could fail to unregister/delete the device, leaving behind a > dangling platform device/GPIO forwarder. This can result in various issues. > The following simple reproducer demonstrates these problems: > > #!/bin/bash > while :; do > # note: whether 'gpiochip0 0' exists or not does not matter. > echo 'gpiochip0 0' > /sys/bus/platform/drivers/gpio-aggregator/new_device > done & > while :; do > modprobe gpio-aggregator > modprobe -r gpio-aggregator > done & > wait > > Starting with the following warning, several kinds of warnings will appear > and the system may become unstable: > > ------------[ cut here ]------------ > list_del corruption, ffff888103e2e980->next is LIST_POISON1 (dead000000000100) > WARNING: CPU: 1 PID: 1327 at lib/list_debug.c:56 __list_del_entry_valid_or_report+0xa3/0x120 > [...] > RIP: 0010:__list_del_entry_valid_or_report+0xa3/0x120 > [...] > Call Trace: > <TASK> > ? __list_del_entry_valid_or_report+0xa3/0x120 > ? __warn.cold+0x93/0xf2 > ? __list_del_entry_valid_or_report+0xa3/0x120 > ? report_bug+0xe6/0x170 > ? __irq_work_queue_local+0x39/0xe0 > ? handle_bug+0x58/0x90 > ? exc_invalid_op+0x13/0x60 > ? asm_exc_invalid_op+0x16/0x20 > ? __list_del_entry_valid_or_report+0xa3/0x120 > gpiod_remove_lookup_table+0x22/0x60 > new_device_store+0x315/0x350 [gpio_aggregator] > kernfs_fop_write_iter+0x137/0x1f0 > vfs_write+0x262/0x430 > ksys_write+0x60/0xd0 > do_syscall_64+0x6c/0x180 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > [...] > </TASK> > ---[ end trace 0000000000000000 ]--- > > Fixes: 828546e24280 ("gpio: Add GPIO Aggregator") > Signed-off-by: Koichiro Den <koichiro.den@xxxxxxxxxxxxx> > --- Geert, does this look good to you? I'd like to send this fix upstream first and backport it to stable. Bartosz