On Fri, Nov 24, 2023 at 5:00 PM Bartosz Golaszewski <brgl@xxxxxxxx> wrote: > There are more instances of this pattern. This seems to be a way to > work around the fact that we have GPIO API functions that can be > called from atomic context (gpiod_set/get_value(), > gpiod_direction_input/output(), etc.) that in their implementation > call driver callbacks that may as well sleep (gc->set(), > gc->direction_output(), etc.). Correct, AFAIK this is why this looks like it does. > Protecting the list of GPIO devices is simple. It should be a mutex as > the list should never be modified from atomic context. If you are referring to gpio_lock then go back and look how that got where it is today: git checkout d2876d08d86f2 (initial gpiolib commit 2008) The ultimately confusing thing is that: 1. Yes it is protecting the list of gpio chips 2. The struct holding items in the list of the gpio chips is called gpio_desc... Then this comment: /* gpio_lock prevents conflicts during gpio_desc[] table updates. * While any GPIO is requested, its gpio_chip is not removable; * each GPIO's "requested" flag serves as a lock and refcount. */ Is cargo-culted to present day and is talking about gpio_desc, but nowadays gpio_desc means something completely different... OK I'll send a patch just deleting this comment, it looks insane. > This can be easily factored out right now. JustDoIt :) > Protecting GPIO descriptors is > trickier. If we use a spinlock for that, we'll run into problems with > GPIO drivers that can sleep. If we use a mutex, we'll have a problem > with users calling GPIO functions from atomic context. > > One idea I have is introducing a strict limit on which functions can > be used from atomic context (we don't enforce anything ATM in > functions that don't have the _cansleep suffix in their names) and > check which parts of the descriptor struct they modify. Then protect > these parts with a spinlock in very limited critical sections. Have a > mutex for everything else that can only be accessed from process > context. > > Another one is introducing strict APIs like gpiod_set_value_atomic() > that'll be designed to be called from atomic context exclusively and > be able to handle it. Everything else must only be called from process > context. This of course would be a treewide change as we'd need to > modify all GPIO calls in interrupt handlers. This is a much harder problem. Many of the current API functions can be called from atomic and nonatomic contexts alike :/ this has historical reasons of course. Back in 2008 most GPIO chips were just on-SoC and resulted in a register write: no problem. Now we have quite a bunch of GPIOs on I2C, SPI ... and the API looks the same. The _cansleep functions were supposed to be used explicitly in places where it is OK that the GPIO can sleep (as in: I don't care if you sleep or not), and every other site using the non-_cansleep versions should be where it has to be atomic. Every call where non-_cansleep is called but it doesn't matter is essentially a bug, they should be using _cansleep versions. (Oh boy ... such much bug.) In 2008 when it was introduced, *_cansleep had one single user: drivers/leds/led-gpio.c because it was assumed that users of that driver would not care if LEDs are on GPIO expanders or on SoC-resident GPIOs. Ironically that driver now keeps track of whether the GPIO is sleepable or not... So if you propose turning this on it's head by creating *_atomic and opt-in to atomic behaviour instead of opting out of it, I'd say yes, but only if we delete all uses of _cansleep at the same time and that means the default behaviour becomes _cansleep. Keeping both around at the same time is going to be a complete mess. Yours, Linus Walleij