Re: gpiod_set_value(): BUG: sleeping function called from invalid context

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 05/12/2021 11:45, Hans Verkuil wrote:
> On 05/12/2021 01:18, Linus Walleij wrote:
>> On Wed, Dec 1, 2021 at 10:40 AM Hans Verkuil <hverkuil@xxxxxxxxx> wrote:
>>
>>> [ 1674.787685]  ___might_sleep+0x148/0x180
>>> [ 1674.791572]  __might_sleep+0x54/0x90
>>> [ 1674.795195]  mutex_lock+0x28/0x80
>>> [ 1674.798556]  pinctrl_get_device_gpio_range+0x3c/0x110
>>
>> There is the error ^
>>
>>> gpiod_set_value() is supposed to be usable from an atomic context, so this
>>> appears to be a bug. It's probably been there for quite a long time. I suspect
>>> OPEN_DRAIN isn't very common, and I think this might be the first time I tested
>>> this driver with this kernel config option set.
>>
>> Nah has nothing to do with open drain :)
>>
>>> Any suggestions?
>>
>> pinctrl_get_device_gpio_range() needs to be modified to not take a mutex
>> because mutex:es can sleep.
>>
>> static int pinctrl_get_device_gpio_range(unsigned gpio,
>>                                          struct pinctrl_dev **outdev,
>>                                          struct pinctrl_gpio_range **outrange)
>> {
>>         struct pinctrl_dev *pctldev;
>>
>>         mutex_lock(&pinctrldev_list_mutex);
>>
>> BLAM!
>>
>> And this definitely needs to be called on this path so no way out
>> of that.
> 
> Any idea why this hasn't been seen/reported before? According to git blame
> that mutex_lock has been there since 2013. Does nobody test with
> CONFIG_DEBUG_ATOMIC_SLEEP? Am I really the first to encounter this?
> 
> Before spending time on this I'd like to be sure that, yes, I'm really the
> first in 8 years to see this, and this wasn't introduced by something else.
> 
>>
>> This mutex pinctrldev_list_mutex is there to protect from devices coming
>> and going as we look for devices with ranges on.
>>
>> One way to solve it would be to simply replace it with a spinlock, maybe
>> that works? (Check the code carefully so there are no things like calls
>> into drivers that may fire interrupts etc...)
>>
>> If it still need to be sleepable (mutex-ish) you need to convert it to
>> use RCU I think? (Which would be pretty cool anyway.)
> 
> RCU seems like a reasonable alternative, but I will have to read up on it
> since it's the first time I'll be using this. All those quizzes in the RCU
> docs (1) scare the hell out of me :-)

It's worse: pinctrl_get_device_gpio_range() takes pinctrldev_list_mutex, but
in the loop it calls pinctrl_match_gpio_range() which calls
mutex_lock(&pctldev->mutex). So it's two mutexes that this function takes from
what is supposed to be atomic context.

In addition, pinctrl_get_device_gpio_range() is called from pinctrl_gpio_direction(),
which also takes pctldev->mutex afterwards.

I'm not sure if the pctldev->mutex can be replaced with rcu, I suspect not.

For something that's supposed to be atomic, there seem to be an awful lot of mutexes...

In the case of the bcm2711 the bcm2835_gpio_direction_output() function calls
pinctrl_gpio_direction_output, which in turn (via pinctrl_gpio_direction()
and pinmux_gpio_direction()) calls bcm2835_pmx_gpio_set_direction().

It appears to me that bcm2835_gpio_direction_output() can call
bcm2835_pmx_gpio_set_direction() directly, thus avoiding all the mutexes.
That's exactly what samsung/pinctrl-samsung.c and qcom/pinctrl-msm.c do,
from what I can tell.

Are pinctrl drivers supposed to call pinctrl_gpio_direction_output() from
the direction_output() op? Isn't that perhaps the bug?

Regards,

	Hans

> 
> Regards,
> 
> 	Hans
> 
> (1) https://www.kernel.org/doc/html/latest/RCU/listRCU.html
> 
>>
>> Yours,
>> Linus Walleij
>>
> 




[Index of Archives]     [Linux Input]     [Video for Linux]     [Gstreamer Embedded]     [Mplayer Users]     [Linux USB Devel]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [Yosemite Backpacking]

  Powered by Linux