Hi Kent,

On Tue, 20 Feb 2024 22:29:59 +0800
Kent Gibson <warthog618@xxxxxxxxx> wrote:

> On Tue, Feb 20, 2024 at 12:10:18PM +0100, Herve Codina wrote:
> > When gpio chip device is removed while some related gpio are used by the
> > user-space, the following warning can appear:
> >   remove_proc_entry: removing non-empty directory 'irq/233', leaking at least 'gpiomon'
> >   WARNING: CPU: 2 PID: 72 at fs/proc/generic.c:717 remove_proc_entry+0x190/0x19c
> >   ...
> >   Call trace:
> >     remove_proc_entry+0x190/0x19c
> >     unregister_irq_proc+0xd0/0x104
> >     free_desc+0x4c/0xc4
> >     irq_free_descs+0x6c/0x90
> >     irq_dispose_mapping+0x104/0x14c
> >     gpiochip_irqchip_remove+0xcc/0x1a4
> >     gpiochip_remove+0x48/0x100
> >   ...
> >
> > Indeed, the gpio cdev uses an IRQ but this IRQ is not released when the
> > gpio chip device is removed.
> >
> > Release IRQs used in the device removal notifier functions.
> > Also move one of these function definition in order to avoid a forward
> > declaration (move after the edge_detector_stop() definition).
> >
> > Signed-off-by: Herve Codina <herve.codina@xxxxxxxxxxx>
> > ---
> >  drivers/gpio/gpiolib-cdev.c | 33 ++++++++++++++++++++++-----------
> >  1 file changed, 22 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/gpio/gpiolib-cdev.c b/drivers/gpio/gpiolib-cdev.c
> > index 2a88736629ef..aec4a4c8490a 100644
> > --- a/drivers/gpio/gpiolib-cdev.c
> > +++ b/drivers/gpio/gpiolib-cdev.c
> > @@ -688,17 +688,6 @@ static void line_set_debounce_period(struct line *line,
> >  	 GPIO_V2_LINE_FLAG_EVENT_CLOCK_HTE | \
> >  	 GPIO_V2_LINE_EDGE_FLAGS)
> >
> > -static int linereq_unregistered_notify(struct notifier_block *nb,
> > -				       unsigned long action, void *data)
> > -{
> > -	struct linereq *lr = container_of(nb, struct linereq,
> > -					  device_unregistered_nb);
> > -
> > -	wake_up_poll(&lr->wait, EPOLLIN | EPOLLERR);
> > -
> > -	return NOTIFY_OK;
> > -}
> > -
> >  static void linereq_put_event(struct linereq *lr,
> >  			      struct gpio_v2_line_event *le)
> >  {
> > @@ -1189,6 +1178,23 @@ static int edge_detector_update(struct line *line,
> >  	return edge_detector_setup(line, lc, line_idx, edflags);
> >  }
> >
> > +static int linereq_unregistered_notify(struct notifier_block *nb,
> > +				       unsigned long action, void *data)
> > +{
> > +	struct linereq *lr = container_of(nb, struct linereq,
> > +					  device_unregistered_nb);
> > +	int i;
> > +
> > +	for (i = 0; i < lr->num_lines; i++) {
> > +		if (lr->lines[i].desc)
> > +			edge_detector_stop(&lr->lines[i]);
> > +	}
> > +
>
> Firstly, the re-ordering in the previous patch creates a race,
> as the NULLing of the gdev->chip serves to numb the cdev ioctls, so
> there is now a window between the notifier being called and that numbing,
> during which userspace may call linereq_set_config() and re-request
> the irq.

Well, in my previous patch, if gdev->chip needs to be set to NULL before the
call to gcdev_unregister(), this can be done.
I made a modification that leads to the following sequence:
--- 8< ---
	...
	gcdev_unregister(gdev);
	gpiochip_free_hogs(gc);
	/* Numb the device, cancelling all outstanding operations */
	gdev->chip = NULL;
	gpiochip_irqchip_remove(gc);
	acpi_gpiochip_remove(gc);
	of_gpiochip_remove(gc);
	gpiochip_remove_pin_ranges(gc);
	...
--- 8< ---

I can call gcdev_unregister() right after gdev->chip = NULL.
The only requirement is that free_irq() (called from gcdev_unregister()) runs
before gpiochip_irqchip_remove().

And so, why not:
--- 8< ---
	...
	gpiochip_free_hogs(gc);
	/* Numb the device, cancelling all outstanding operations */
	gdev->chip = NULL;
	gcdev_unregister(gdev);
	gpiochip_irqchip_remove(gc);
	acpi_gpiochip_remove(gc);
	of_gpiochip_remove(gc);
	gpiochip_remove_pin_ranges(gc);
	...
--- 8< ---

>
> There is also a race here with linereq_set_config(). That can be prevented
> by holding the lr->config_mutex - assuming the notifier is not being called
> from atomic context.

I missed that one and indeed, I can probably take the mutex.
With the mutex held, there is no more race condition with
linereq_set_config() and so the IRQ cannot be re-requested.

>
> You also have a race with the line being freed that could pull the
> lr out from under you, so a use after free problem.

I probably missed something but I don't see this use after free.
Can you give me some details/pointers?

> I'd rather live with the warning :(.
> Fixing that requires rethinking the lifecycle management for the
> linereq/lineevent.

Well, currently the warning is a big one with a dump_stack included.
It would be good to have it fixed.

What is needed to fix it is to have free_irq() called before
gpiochip_irqchip_remove(). Is there really no way to have this correct
sequence without rethinking all the lifecycle management?
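
To make the ordering argument concrete, here is a small user-space C model of
the two removal sequences. This is only a sketch, not kernel code: every
model_* name, NUM_LINES and both structures are hypothetical stand-ins I made
up for illustration. It is single-threaded, so it only shows the ordering
between numbing, freeing the IRQs and removing the irqchip; it does not model
lr->config_mutex or the use after free you mentioned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_LINES 4 /* hypothetical number of lines on the chip */

/* Hypothetical stand-ins for the kernel structures, not the real ones. */
struct model_chip { int unused; };

struct model_gdev {
	struct model_chip *chip;       /* NULL once the device is numbed */
	bool irq_requested[NUM_LINES]; /* true while a cdev line holds its IRQ */
};

/* Models linereq_set_config(): fails once the device is numbed. */
static int model_set_config(struct model_gdev *gdev, size_t line)
{
	assert(gdev != NULL);
	if (line >= NUM_LINES || !gdev->chip)
		return -1;
	gdev->irq_requested[line] = true;
	return 0;
}

/* Models gcdev_unregister(): the notifier stops edge detection on all lines. */
static void model_gcdev_unregister(struct model_gdev *gdev)
{
	assert(gdev != NULL);
	for (size_t i = 0; i < NUM_LINES; i++)
		gdev->irq_requested[i] = false;
}

/*
 * Models gpiochip_irqchip_remove(): returns how many IRQs are still
 * requested, i.e. how many times the remove_proc_entry warning would fire.
 */
static int model_irqchip_remove(const struct model_gdev *gdev)
{
	int leaked = 0;

	assert(gdev != NULL);
	for (size_t i = 0; i < NUM_LINES; i++)
		if (gdev->irq_requested[i])
			leaked++;
	return leaked;
}

/* Proposed order: numb, then unregister, then remove the irqchip. */
static int model_remove_proposed(struct model_gdev *gdev, size_t racing_line)
{
	gdev->chip = NULL;                         /* numb the device */
	model_gcdev_unregister(gdev);              /* IRQs are freed here */
	(void)model_set_config(gdev, racing_line); /* late ioctl is rejected */
	return model_irqchip_remove(gdev);
}

/* Order of the previous patch: unregister first, numb afterwards. */
static int model_remove_racy(struct model_gdev *gdev, size_t racing_line)
{
	model_gcdev_unregister(gdev);
	(void)model_set_config(gdev, racing_line); /* window: IRQ re-requested */
	gdev->chip = NULL;
	return model_irqchip_remove(gdev);
}
```

In this model, model_remove_proposed() reports no IRQ left at irqchip removal
even if a set_config call lands after the notifier, whereas
model_remove_racy() reports one. The real code would still need the
config_mutex handling on top of this.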

Also, after the warning related to the IRQ, the following one is present:
--- 8< ---
[ 9593.527961] gpio gpiochip9: REMOVING GPIOCHIP WITH GPIOS STILL REQUESTED
[ 9593.535602] ------------[ cut here ]------------
[ 9593.540244] WARNING: CPU: 0 PID: 309 at drivers/gpio/gpiolib.c:2352 gpiod_free.part.0+0x20/0x48
...
[ 9593.725016] Call trace:
[ 9593.727468]  gpiod_free.part.0+0x20/0x48
[ 9593.731404]  gpiod_free+0x14/0x24
[ 9593.734728]  lineevent_free+0x40/0x74
[ 9593.738402]  lineevent_release+0x14/0x24
[ 9593.742335]  __fput+0x70/0x2bc
[ 9593.745403]  __fput_sync+0x50/0x5c
[ 9593.748817]  __arm64_sys_close+0x38/0x7c
[ 9593.752751]  invoke_syscall+0x48/0x114
...
[ 9593.815299] ---[ end trace 0000000000000000 ]---
[ 9593.820616] hotplug-manager dock-hotplug-manager: remove overlay 0 (ovcs id 1)
gpiomon: error waiting for events: No such device
#
--- 8< ---

Best regards,
Hervé