On 05/12/2021 11:45, Hans Verkuil wrote: > On 05/12/2021 01:18, Linus Walleij wrote: >> On Wed, Dec 1, 2021 at 10:40 AM Hans Verkuil <hverkuil@xxxxxxxxx> wrote: >> >>> [ 1674.787685] ___might_sleep+0x148/0x180 >>> [ 1674.791572] __might_sleep+0x54/0x90 >>> [ 1674.795195] mutex_lock+0x28/0x80 >>> [ 1674.798556] pinctrl_get_device_gpio_range+0x3c/0x110 >> >> There is the error ^ >> >>> gpiod_set_value() is supposed to be usable from an atomic context, so this >>> appears to be a bug. It's probably been there for quite a long time. I suspect >>> OPEN_DRAIN isn't very common, and I think this might be the first time I tested >>> this driver with this kernel config option set. >> >> Nah has nothing to do with open drain :) >> >>> Any suggestions? >> >> pinctrl_get_device_gpio_range() needs to be modified to not take a mutex >> because mutex:es can sleep. >> >> static int pinctrl_get_device_gpio_range(unsigned gpio, >> struct pinctrl_dev **outdev, >> struct pinctrl_gpio_range **outrange) >> { >> struct pinctrl_dev *pctldev; >> >> mutex_lock(&pinctrldev_list_mutex); >> >> BLAM! >> >> And this definitely needs to be called on this path so no way out >> of that. > > Any idea why this hasn't been seen/reported before? According to git blame > that mutex_lock has been there since 2013. Does nobody test with > CONFIG_DEBUG_ATOMIC_SLEEP? Am I really the first to encounter this? > > Before spending time on this I'd like to be sure that, yes, I'm really the > first in 8 years to see this, and this wasn't introduced by something else. > >> >> This mutex pinctrldev_list_mutex is there to protect from devices coming >> and going as we look for devices with ranges on. >> >> One way to solve it would be to simply replace it with a spinlock, maybe >> that works? (Check the code carefully so there are no things like calls >> into drivers that may fire interrupts etc...) >> >> If it still need to be sleepable (mutex-ish) you need to convert it to >> use RCU I think? (Which would be pretty cool anyway.) > > RCU seems like a reasonable alternative, but I will have to read up on it > since it's the first time I'll be using this. All those quizzes in the RCU > docs (1) scare the hell out of me :-) It's worse: pinctrl_get_device_gpio_range() takes pinctrldev_list_mutex, but in the loop it calls pinctrl_match_gpio_range() which calls mutex_lock(&pctldev->mutex). So it's two mutexes that this function takes from what is supposed to be atomic context. In addition, pinctrl_get_device_gpio_range() is called from pinctrl_gpio_direction(), which also takes pctldev->mutex afterwards. I'm not sure if the pctldev->mutex can be replaced with rcu, I suspect not. For something that's supposed to be atomic, there seem to be an awful lot of mutexes... In the case of the bcm2711 the bcm2835_gpio_direction_output() function calls pinctrl_gpio_direction_output, which in turn (via pinctrl_gpio_direction() and pinmux_gpio_direction()) calls bcm2835_pmx_gpio_set_direction(). It appears to me that bcm2835_gpio_direction_output() can call bcm2835_pmx_gpio_set_direction() directly, thus avoiding all the mutexes. That's exactly what samsung/pinctrl-samsung.c and qcom/pinctrl-msm.c do, from what I can tell. Are pinctrl drivers supposed to call pinctrl_gpio_direction_output() from the direction_output() op? Isn't that perhaps the bug? Regards, Hans > > Regards, > > Hans > > (1) https://www.kernel.org/doc/html/latest/RCU/listRCU.html > >> >> Yours, >> Linus Walleij >> >