Hi Bartosz,

On Thu, 22 Feb 2024 00:31:08 -0800
Bartosz Golaszewski <brgl@xxxxxxxx> wrote:

> On Thu, 22 Feb 2024 02:05:30 +0100, Kent Gibson <warthog618@xxxxxxxxx> said:
> > On Thu, Feb 22, 2024 at 08:57:44AM +0800, Kent Gibson wrote:
> >> On Tue, Feb 20, 2024 at 10:29:59PM +0800, Kent Gibson wrote:
> >> > On Tue, Feb 20, 2024 at 12:10:18PM +0100, Herve Codina wrote:
> >>
> >> ...
> >>
> >> > >  }
> >> > >
> >> > > +static int linereq_unregistered_notify(struct notifier_block *nb,
> >> > > +				       unsigned long action, void *data)
> >> > > +{
> >> > > +	struct linereq *lr = container_of(nb, struct linereq,
> >> > > +					  device_unregistered_nb);
> >> > > +	int i;
> >> > > +
> >> > > +	for (i = 0; i < lr->num_lines; i++) {
> >> > > +		if (lr->lines[i].desc)
> >> > > +			edge_detector_stop(&lr->lines[i]);
> >> > > +	}
> >> > > +
> >> >
> >> > Firstly, the re-ordering in the previous patch creates a race,
> >> > as the NULLing of the gdev->chip serves to numb the cdev ioctls, so
> >> > there is now a window between the notifier being called and that numbing,
> >> > during which userspace may call linereq_set_config() and re-request
> >> > the irq.
> >> >
> >> > There is also a race here with linereq_set_config().  That can be prevented
> >> > by holding the lr->config_mutex - assuming the notifier is not being called
> >> > from atomic context.
> >> >
> >>
> >> It occurs to me that the fixed reordering in patch 1 would place
> >> the notifier call AFTER the NULLing of the ioctls, so there will no longer
> >> be any chance of a race with linereq_set_config() - so holding the
> >> config_mutex semaphore is not necessary.
> >>
> >
> > NULLing -> numbing
> >
> > The gdev->chip is NULLed, so the ioctls are numbed.
> > And I need to let the coffee soak in before sending.
> >
> >> In which case this patch is fine - it is only patch 1 that requires
> >> updating.
> >>
> >> Cheers,
> >> Kent.
> >
>
> The fix for the user-space issue may be more-or-less correct but the problem is
> deeper and this won't fix it for in-kernel users.
>
> Herve: please consider the following DT snippet:
>
> 	gpio0 {
> 		compatible = "foo";
>
> 		gpio-controller;
> 		#gpio-cells = <2>;
> 		interrupt-controller;
> 		#interrupt-cells = <1>;
> 		ngpios = <8>;
> 	};
>
> 	consumer {
> 		compatible = "bar";
>
> 		interrupts-extended = <&gpio0 0>;
> 	};
>
> If you unbind the "gpio0" device after the consumer requested the interrupt,
> you'll get the same splat. And device links will not help you here (on that
> note: Saravana: is there anything we could do about it? Have you even
> considered making the irqchip subsystem use the driver model in any way? Is it
> even feasible?).
>
> I would prefer this to be fixed at a lower lever than the GPIOLIB character
> device.

I think this use case is covered.

When the consumer device related to the consumer DT node is added, a
consumer/supplier relationship is created:
parse_interrupts() parses the 'interrupts-extended' property
  https://elixir.bootlin.com/linux/v6.8-rc1/source/drivers/of/property.c#L1316
and so, of_link_to_phandle() creates the consumer/supplier link.
  https://elixir.bootlin.com/linux/v6.8-rc1/source/drivers/of/property.c#L1316

With that link present, if the supplier is removed, the consumer is removed
first.
The consumer should release the interrupt during its remove process (i.e.
explicitly in its .remove() or implicitly because of a devm_*() call).

At least, that is my understanding.

Best regards,
Hervé
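
PS: To make the ordering I have in mind concrete, here is a toy userspace
model. It is only a sketch and not kernel code: struct toy_device,
toy_link_add(), toy_bind() and toy_unbind() are hypothetical names made up
for this illustration. It assumes that a supplier with linked consumers
unbinds those consumers before it is unbound itself, and that a consumer
drops its IRQ while being unbound.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CONSUMERS 4
#define MAX_LOG 8

/* Toy stand-in for a device; this is NOT the kernel's struct device. */
struct toy_device {
	const char *name;
	bool bound;
	bool irq_requested;	/* consumer side: holds an IRQ from its supplier */
	struct toy_device *consumers[MAX_CONSUMERS];
	int num_consumers;
};

/* Records the order in which devices get unbound. */
static const char *unbind_log[MAX_LOG];
static int unbind_count;

/* Models the link created when 'interrupts-extended' points at a supplier. */
static int toy_link_add(struct toy_device *consumer,
			struct toy_device *supplier)
{
	if (supplier->num_consumers >= MAX_CONSUMERS)
		return -1;
	supplier->consumers[supplier->num_consumers++] = consumer;
	return 0;
}

/* Models a successful probe; wants_irq says whether an IRQ was requested. */
static void toy_bind(struct toy_device *dev, bool wants_irq)
{
	dev->bound = true;
	dev->irq_requested = wants_irq;
}

/* Unbind all linked consumers first, then the device itself. */
static void toy_unbind(struct toy_device *dev)
{
	int i;

	if (!dev->bound)
		return;

	for (i = 0; i < dev->num_consumers; i++)
		toy_unbind(dev->consumers[i]);

	/* Models .remove() or devm_*() cleanup releasing the IRQ. */
	dev->irq_requested = false;
	dev->bound = false;
	if (unbind_count < MAX_LOG)
		unbind_log[unbind_count++] = dev->name;
}
```

With "gpio0" as supplier and "consumer" linked to it, calling
toy_unbind() on "gpio0" logs "consumer" before "gpio0", and the consumer
no longer holds its IRQ by the time the supplier goes away. Whether the
real driver core gives the same guarantee for irqchips is exactly the
point under discussion.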