Re: GPIOLIB locking is broken and how to fix it

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Dec 07, 2023 at 07:37:54PM +0100, Bartosz Golaszewski wrote:
> On Tue, Nov 28, 2023 at 11:47 AM Bartosz Golaszewski <brgl@xxxxxxxx> wrote:
> >
>
> [snip]
>
> >
> > Because years of technical debt, that's why. :)
> >
>
> Speaking of technical debt: you may have noticed that despite stating
> I'm almost done last week, I still haven't submitted my locking
> rework.
>
> The reason for that is that I'm stuck on some corner-cases related to
> the GPIO <-> pinctrl interaction. Specifically the fact that we have
> GPIOLIB API functions that may be called from atomic context which may
> end up calling into pinctrl where a mutex will be acquired.
>

To be clear, that is an existing pinctrl mutex?

> An example of that is any of the GPIO chips that don't set the
> can_sleep field in struct gpio_chip but still use
> gpiochip_generic_config() (e.g. tegra186). We can then encounter the
> following situation:
>
> irq_handler() // in atomic context
>   gpiod_direction_output() // line is open-drain
>     gpio_set_config()
>       gpiochip_generic_config()
>         pinctrl_gpio_set_config()
>           mutex_lock()
>

Isn't using a mutex (the pinctrl one here) from atomic always a
problem?  Shouldn't this flow be handed off to irq_thread()?

> Currently we don't take any locks nor synchronize in any other way
> (which is wrong as concurrent gpiod_direction_output() and
> gpiod_direction_input() will get in each other's way). Using a mutex
> will be fine but for non-sleeping chips if we use a spinlock here (no
> other choice really) we'll set off fireworks.
>
> One of the ideas I have is using the fact that we already use atomic
> bitops in most places. Let's not take locks but add a new flag:
> FLAG_SETTING_DIRECTION. Now when we go into
> gpiod_direction_output/input(), we test and set it. A subsequent call
> will fail with EBUSY or EAGAIN as long as it's set. It will have no
> effect on set/get() - any synchronization will be left to the driver.
> When we're done, we clear it after setting the relevant direction
> flag.
>
> Does this make any sense? There's still the label pointer and debounce
> period stored in the descriptor but these are not accessed in atomic
> context AFAICT.
>

Makes sense to me, as it is basically the sub-state solution I suggested
earlier for request/release, but applied to direction.  Not sure about
the contention behaviour though, as that is something userspace will
see and might not be expecting.

OTOH I'm starting to think that serialising callbacks might be a good idea
- unless it is crystal clear to the driver authors that the callbacks may
be called concurrently.

The debounce is really a cdev field.  Putting it in the desc made sense
to me at the time as it is per-line, not per-request, but perhaps it
should moved into the cdev linereq, even if that means having to alloc
space for it, just to get it out of your hair.

Cheers,
Kent.




[Index of Archives]     [Linux SPI]     [Linux Kernel]     [Linux ARM (vger)]     [Linux ARM MSM]     [Linux Omap]     [Linux Arm]     [Linux Tegra]     [Fedora ARM]     [Linux for Samsung SOC]     [eCos]     [Linux Fastboot]     [Gcc Help]     [Git]     [DCCP]     [IETF Announce]     [Security]     [Linux MIPS]     [Yosemite Campsites]

  Powered by Linux