Hi Bartosz, On 3/29/24 4:16 PM, Bartosz Golaszewski wrote: > On Fri, 29 Mar 2024 15:11:21 +0100, Hans de Goede <hans@xxxxxxxxx> said: >> Hi All, >> >> I've already tried to fix this, so let me just copy and paste my half finished patch >> to explain the problem. >> >> I was planning on submitting this as a RFC patch at least, but there are also some >> other new issues with 6.9 on this tablet and I'm not sure how this interacts >> with those issues and I don't have time to work on this any further this weekend. >> >> Anyways below is the patch / bug report. >> >> I'm wondering if a better fix would be to add a "ready" flag to gdev >> and may gpiochip_find ignore not yet ready chips (we need them on >> the list before they are ready to reserve the GPIO numbers) ? >> >> Regards, >> >> Hans >> > > Hi Hans! > > Thanks for the report. I hope I'm not being naive here but would the following > one-liner work? > > diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c > index ce94e37bcbee..69f365ccbfd8 100644 > --- a/drivers/gpio/gpiolib.c > +++ b/drivers/gpio/gpiolib.c > @@ -1179,7 +1179,7 @@ struct gpio_device *gpio_device_find(const void *data, > > gc = srcu_dereference(gdev->chip, &gdev->srcu); > > - if (gc && match(gc, data)) > + if (device_is_registered(&gdev->dev) && gc && match(gc, data)) > return gpio_device_get(gdev); > } > > This would make gpio_device_find() ignore any GPIO device that's not yet > registered on the GPIO bus which is almost the last step of the registration > process right before creating the sysfs attributes. Yes that should work and it has the added advantage that it also waits for things like the irqchip to be setup before gpio_device_find() will find the gpio-device. I cannot trigger the race every boot, but I do hit it quite regularly and with this change I've done 10 successful consecutive boots, so I believe that this indeed fixes the race. I've prepared a patch with this fix now which I'll send out shortly. As for Andy's suggestion I'm not all that familiar with the RCU stuff, but I think that if we were to go that route then the device_is_registered() check should be moved up to above the "guard(srcu)(&gdev->srcu);" line rather then above the "gc = srcu_deref..." line, since in that case we are not using the gdev->chip pointer at all if we bail ? Anyways for now I've just gone with your suggested 1 liner. Regards, Hans p.s. While looking into this I noticed one other possible problem, unless gpiochip_add_data_with_key() and gpiolib_dev_init() are guaranteed to never run at the same time then we may end up calling gpiochip_setup_dev() twice, once from gpiolib_dev_init() and once from gpiochip_add_data_with_key() when the 2 race.