Hi Johannes, Thank you for the quick reply. On 4/11/23 21:38, Johannes Berg wrote: > Hi, > >> Sorry to bring the bearer of bad news, but your commit 2a5a8fa8b23 >> ("leds: trigger: use RCU to protect the led_cdevs list"), >> is causing the following RCU warning when used with blinking >> triggers on I2C LED controllers which support hw blinking. > > Err, well, surely that is a pre-existing driver bug then? So I just checked and the following LED drivers all have a blink_set() implementation which calls mutex_lock() and/or does I2C transfers: leds-an30259a.c leds-aw2013.c leds-bd2802.c leds-lp3944.c leds-pca9532.c leds-pca963x.c leds-wm831x-status.c And I only found one i2c LED controller driver which defers I2C transfers to a workqueue itself (leds-tca6507.c). And looking at: include/linux/leds.h Then the brightness_set callback is explicitly marked as "Must not sleep." (and there is a brightness_set_blocking which e.g. I2C drivers can use instead). So I believe that the intention has always been that a driver's blink_set callback is allowed to sleep. With that said you seem to be right that there seems to be a long standing bug where led_trigger_blink[_oneshot] calls led_classdev.blink_set() in a context where it may not sleep. But that is more of a LED (trigger) core bug then an issue with the driver(s). >> The specific problem is drivers/leds/led-triggers.c: >> led_trigger_blink_setup() which does: >> >> rcu_read_lock(); >> list_for_each_entry_rcu(led_cdev, &trig->led_cdevs, trig_list) { >> if (oneshot) >> led_blink_set_oneshot(led_cdev, delay_on, delay_off, >> invert); >> else >> led_blink_set(led_cdev, delay_on, delay_off); >> } >> rcu_read_unlock(); >> >> And that led_blink_set() call then hits this path: >> >> if (!test_bit(LED_BLINK_ONESHOT, &led_cdev->work_flags) && >> led_cdev->blink_set && >> !led_cdev->blink_set(led_cdev, delay_on, delay_off)) >> return; >> >> Which calls directly into the LED controller driver > > Sure, so far so good. > >> which talks to the LED controller over I2C which may sleep. > > Which seems to me was already wrong before my patch, since the code was: > > read_lock_irqsave(&trig->leddev_list_lock, flags); > list_for_each_entry(led_cdev, &trig->led_cdevs, trig_list) { > if (oneshot) > led_blink_set_oneshot(led_cdev, delay_on, delay_off, > invert); > else > led_blink_set(led_cdev, delay_on, delay_off); > } > read_unlock_irqrestore(&trig->leddev_list_lock, flags); > > > Surely, the code wasn't allowed to sleep in an _irqsave() section? You'd > just see a different check complain, rather than about RCU, I guess. Hmm, so the irqsave part of this was introduced by commit 27af8e2c90fb ("leds: trigger: fix potential deadlock with libata") but even before then I think sleeping here was not allowed, given that an rwlock is a spinlock variant the non irqsave version also leads to a section where sleeping is not allowed I believe. Further git archeology seems to indicate that this problem has existed for a long long time already. I guess I'm the first user of a trigger which calls led_trigger_blink[_oneshot] on a led_classdev with a hw blink_set() implementation which sleeps. Or at least I'm the first user to do so with various lock-debugging options enabled ... > So maybe you're bringing bad news, but I don't think it's for me ;-) Yes it seems I have unearthed a bug which is much older then your RCU changes. > I don't see cht_wc_leds_brightness_get() or a driver/module called > leds_cht_wcove even in linux-next, so I guess you should look wherever > you got _that_ from. Right that is part of a new LED driver I'm working on. I guess I could fix things in the driver, but this seems to be a (trigger) core bug so lets try to solve it there. Regards, Hans