Hello Bart, On Sun, Jul 30, 2023 at 12:07:33PM +0200, Bartosz Golaszewski wrote: > On Sat, Jul 29, 2023 at 11:37 PM Uwe Kleine-König > <u.kleine-koenig@xxxxxxxxxxxxxx> wrote: > > > > Hello Bartosz, > > > > On Sat, Jul 29, 2023 at 04:09:40PM +0200, Bartosz Golaszewski wrote: > > > Looks good to me (although I have my reservations about the concept of > > > foo_alloc() for subsystems in the kernel...). > > > > Wolfram's EOSS talk[1] mentioned "__cleanup__ + kref as suggested by Bartosz? > > Paradigm shift, probably looong way to go". I guess that's what you'd > > prefer? Do you have a link for me to read about this? > > > > For now I prefer the gpiolib model. One structure allocated and > controlled by the driver (struct gpio_chip) which needs to live only > as long as the device is bound to a driver and a second structure > private to the subsystem, allocated and controlled by the subsystem > (struct gpio_device) which also contains the referenced counted struct > device and is only released by the device's release callback. The issue I want to fix for pwm (but don't know yet how to do) is: What should happen to PWMs that are requested by a consumer when the PWM driver goes away. I looked into how gpio does it, and I think the "solution" there is: dev_crit(&gdev->dev, "REMOVING GPIOCHIP WITH GPIOS STILL REQUESTED\n"); introduced in e1db1706c86e ("gpio: gpiolib: set gpiochip_remove retval to void"). (But the problem is actually older because returning -EBUSY as done before is bad, too) I'd hope this could be done better?! While trying to understand how gpio works, I found a few issues that are (I think) fixable with the gpiolib model: - gpiochip_add_data_with_key() calls device_initialize(&gdev->dev) and has later error paths that don't do device_put() but kfree gdev. - the locking scheme in gpiod_request_commit() looks strange. gpio_lock is released and retaken possibly several times. I wonder what it actually protects there. Maybe doing diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c index edab00c9cb3c..496b1cebba58 100644 --- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -2064,13 +2064,11 @@ static int gpiod_request_commit(struct gpio_desc *desc, const char *label) goto out_free_unlock; } } + spin_unlock_irqrestore(&gpio_lock, flags); if (gc->get_direction) { /* gc->get_direction may sleep */ - spin_unlock_irqrestore(&gpio_lock, flags); gpiod_get_direction(desc); - spin_lock_irqsave(&gpio_lock, flags); } - spin_unlock_irqrestore(&gpio_lock, flags); return 0; out_free_unlock: simplifies the code and given that gpiod_get_direction() rechecks gc->get_direction unlocked I don't think we'd loose anything here. - there is a race condition: gpiochip_remove() takes &gdev->sem when invalidating gdev->chip and calling gpiochip_set_data(), but the various gpio API functions calling VALIDATE_DESC_VOID don't hold &gdev->sem, so gpiochip_remove() might clean gdev->chip just between a consumer calling VALIDATE_DESC_VOID(desc) and WARN_ON(desc->gdev->chip->can_sleep) (e.g. in gpiod_set_value). > IMO there shouldn't be any need for PWM drivers to dereference struct > device held by struct pwm_chip. If anything - it should be passed to > the drivers in subsystem callbacks. I don't understand this. I think we agree that a PWM driver shouldn't have to care about the devices's lifetimes. It's difficult enough to get this right on the subsystem level. > I may be wrong of course, I don't know this subsystem very well but it > seems to follow a pattern that's pretty common in the kernel and > causes ownership confusion. Yes that's common. I think another thing that's common though is that device lifetime isn't properly handled, and while I don't consider myself as an expert here, the above makes me consider that gpio is no exception here. So I doubt it serves as a good example to copy from. Having said that I think the ..._alloc approach is easy enough for subsystem drivers. Also for pwm we only need a devm_... variant, so getting the driver part right is really easy. And given that ..._alloc makes it easier for a subsystem core to do things right (as it only has to handle a single data structure that lives long enough) that's what I did here. Best regards Uwe -- Pengutronix e.K. | Uwe Kleine-König | Industrial Linux Solutions | https://www.pengutronix.de/ |
Attachment:
signature.asc
Description: PGP signature