Re: "leds: trigger: use RCU to protect the led_cdevs list" triggers RCU error checks

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Johannes,

Thank you for the quick reply.

On 4/11/23 21:38, Johannes Berg wrote:
> Hi,
> 
>> Sorry to bring the bearer of bad news, but your commit 2a5a8fa8b23
>> ("leds: trigger: use RCU to protect the led_cdevs list"),
>> is causing the following RCU warning when used with blinking
>> triggers on I2C LED controllers which support hw blinking.
> 
> Err, well, surely that is a pre-existing driver bug then?

So I just checked and the following LED drivers all have
a blink_set() implementation which calls mutex_lock()
and/or does I2C transfers:

leds-an30259a.c
leds-aw2013.c
leds-bd2802.c
leds-lp3944.c
leds-pca9532.c
leds-pca963x.c
leds-wm831x-status.c

And I only found one i2c LED controller driver which defers
I2C transfers to a workqueue itself (leds-tca6507.c).

And looking at: include/linux/leds.h

Then the brightness_set callback is explicitly marked as
"Must not sleep." (and there is a brightness_set_blocking
which e.g. I2C drivers can use instead).

So I believe that the intention has always been that
a driver's blink_set callback is allowed to sleep.

With that said you seem to be right that there seems to
be a long standing bug where led_trigger_blink[_oneshot]
calls led_classdev.blink_set() in a context where it may
not sleep.

But that is more of a LED (trigger) core bug then an
issue with the driver(s).

>> The specific problem is drivers/leds/led-triggers.c:
>> led_trigger_blink_setup() which does:
>>
>>         rcu_read_lock();
>>         list_for_each_entry_rcu(led_cdev, &trig->led_cdevs, trig_list) {
>>                 if (oneshot)
>>                         led_blink_set_oneshot(led_cdev, delay_on, delay_off,
>>                                               invert);
>>                 else
>>                         led_blink_set(led_cdev, delay_on, delay_off);
>>         }
>>         rcu_read_unlock();
>>
>> And that led_blink_set() call then hits this path:
>>
>>         if (!test_bit(LED_BLINK_ONESHOT, &led_cdev->work_flags) &&
>>             led_cdev->blink_set &&
>>             !led_cdev->blink_set(led_cdev, delay_on, delay_off))
>>                 return;
>>
>> Which calls directly into the LED controller driver
> 
> Sure, so far so good.
> 
>> which talks to the LED controller over I2C which may sleep.
> 
> Which seems to me was already wrong before my patch, since the code was:
> 
>        read_lock_irqsave(&trig->leddev_list_lock, flags);
>        list_for_each_entry(led_cdev, &trig->led_cdevs, trig_list) {
>                if (oneshot)
>                        led_blink_set_oneshot(led_cdev, delay_on, delay_off,
>                                              invert);
>                else
>                        led_blink_set(led_cdev, delay_on, delay_off);
>        }
>        read_unlock_irqrestore(&trig->leddev_list_lock, flags);
> 
> 
> Surely, the code wasn't allowed to sleep in an _irqsave() section? You'd
> just see a different check complain, rather than about RCU, I guess.

Hmm, so the irqsave part of this was introduced by commit 27af8e2c90fb
("leds: trigger: fix potential deadlock with libata") but even before
then I think sleeping here was not allowed, given that an rwlock is
a spinlock variant the non irqsave version also leads to a section
where sleeping is not allowed I believe.

Further git archeology seems to indicate that this problem has existed
for a long long time already. I guess I'm the first user of a trigger
which calls led_trigger_blink[_oneshot] on a led_classdev with
a hw blink_set() implementation which sleeps. Or at least I'm
the first user to do so with various lock-debugging options
enabled ...

> So maybe you're bringing bad news, but I don't think it's for me ;-)

Yes it seems I have unearthed a bug which is much older then your
RCU changes.

> I don't see cht_wc_leds_brightness_get() or a driver/module called
> leds_cht_wcove even in linux-next, so I guess you should look wherever
> you got _that_ from.

Right that is part of a new LED driver I'm working on. I guess I could
fix things in the driver, but this seems to be a (trigger) core bug
so lets try to solve it there.

Regards,

Hans





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux