Hi Hillf, On 6/2/24 4:30 AM, Hillf Danton wrote: > On Sat, 1 Jun 2024 21:55:28 +0200 Hans de Goede <hdegoede@xxxxxxxxxx> >> The input subsystem registers LEDs with default triggers while holding >> the input_lock and input_register_handler() takes the input_lock this >> means that a triggers activate method cannot directly call >> input_register_handler(). >> >> Move the calling of input_register_handler() to the already existing >> delayed-work to avoid this issue. >> >> Here is a lockdep from having the input-events trigger activated on a LED >> and then after that plugging in a USB keyboard: >> >> [ 2840.220145] usb 1-1.3: new low-speed USB device number 3 using xhci_hcd >> [ 2840.307172] usb 1-1.3: New USB device found, idVendor=0603, idProduct=0002, bcdDevice= 2.21 >> [ 2840.307375] usb 1-1.3: New USB device strings: Mfr=1, Product=2, SerialNumber=0 >> [ 2840.307423] usb 1-1.3: Product: USB Composite Device >> [ 2840.307456] usb 1-1.3: Manufacturer: SINO WEALTH >> [ 2840.333985] input: SINO WEALTH USB Composite Device as /devices/pci0000:00/0000:00:14.0/usb1/1-1/1-1.3/1-1.3:1.0/0003:0603:0002.0007/input/input19 >> >> [ 2840.386545] ====================================================== >> [ 2840.386549] WARNING: possible circular locking dependency detected >> [ 2840.386554] 6.10.0-rc1+ #97 Tainted: G C E >> [ 2840.386558] ------------------------------------------------------ >> [ 2840.386562] kworker/1:1/52 is trying to acquire lock: >> [ 2840.386566] ffff98fcf1629300 (&led_cdev->led_access){+.+.}-{3:3}, at: led_classdev_register_ext+0x1c6/0x380 >> [ 2840.386590] >> but task is already holding lock: >> [ 2840.386593] ffffffff88130cc8 (input_mutex){+.+.}-{3:3}, at: input_register_device.cold+0x47/0x150 >> [ 2840.386608] >> which lock already depends on the new lock. >> >> [ 2840.386611] >> the existing dependency chain (in reverse order) is: >> [ 2840.386615] >> -> #3 (input_mutex){+.+.}-{3:3}: >> [ 2840.386624] __mutex_lock+0x8c/0xc10 >> [ 2840.386634] input_register_handler+0x1c/0xf0 >> [ 2840.386641] 0xffffffffc142c437 >> [ 2840.386655] led_trigger_set+0x1e1/0x2e0 >> [ 2840.386661] led_trigger_register+0x170/0x1b0 >> [ 2840.386666] do_one_initcall+0x5e/0x3a0 >> [ 2840.386675] do_init_module+0x60/0x220 >> [ 2840.386683] __do_sys_init_module+0x15f/0x190 >> [ 2840.386689] do_syscall_64+0x93/0x180 >> [ 2840.386696] entry_SYSCALL_64_after_hwframe+0x76/0x7e >> [ 2840.386705] >> -> #2 (&led_cdev->trigger_lock){+.+.}-{3:3}: >> [ 2840.386714] down_write+0x3b/0xd0 >> [ 2840.386720] led_trigger_register+0x12c/0x1b0 >> [ 2840.386725] rfkill_register+0xec/0x340 [rfkill] >> [ 2840.386739] wiphy_register+0x82a/0x930 [cfg80211] >> [ 2840.386907] brcmf_cfg80211_attach+0xcbd/0x1430 [brcmfmac] >> [ 2840.386952] brcmf_attach+0x1ba/0x4c0 [brcmfmac] >> [ 2840.386991] brcmf_pcie_setup+0x899/0xc70 [brcmfmac] >> [ 2840.387030] brcmf_fw_request_done+0x13b/0x180 [brcmfmac] >> [ 2840.387070] request_firmware_work_func+0x3b/0x70 >> [ 2840.387078] process_one_work+0x21a/0x590 >> [ 2840.387085] worker_thread+0x1d1/0x3e0 >> [ 2840.387090] kthread+0xee/0x120 >> [ 2840.387096] ret_from_fork+0x30/0x50 >> [ 2840.387105] ret_from_fork_asm+0x1a/0x30 >> [ 2840.387112] >> -> #1 (leds_list_lock){++++}-{3:3}: >> [ 2840.387123] down_write+0x3b/0xd0 >> [ 2840.387129] led_classdev_register_ext+0x29e/0x380 >> [ 2840.387134] 0xffffffffc0e6b74c >> [ 2840.387143] platform_probe+0x40/0xa0 >> [ 2840.387151] really_probe+0xde/0x340 >> [ 2840.387157] __driver_probe_device+0x78/0x110 >> [ 2840.387162] driver_probe_device+0x1f/0xa0 >> [ 2840.387168] __driver_attach+0xba/0x1c0 >> [ 2840.387173] bus_for_each_dev+0x6b/0xb0 >> [ 2840.387180] bus_add_driver+0x111/0x1f0 >> [ 2840.387185] driver_register+0x6e/0xc0 >> [ 2840.387191] do_one_initcall+0x5e/0x3a0 >> [ 2840.387197] do_init_module+0x60/0x220 >> [ 2840.387204] __do_sys_init_module+0x15f/0x190 >> [ 2840.387210] do_syscall_64+0x93/0x180 >> [ 2840.387217] entry_SYSCALL_64_after_hwframe+0x76/0x7e >> [ 2840.387224] >> -> #0 (&led_cdev->led_access){+.+.}-{3:3}: >> [ 2840.387233] __lock_acquire+0x11c6/0x1f20 >> [ 2840.387239] lock_acquire+0xc8/0x2b0 >> [ 2840.387244] __mutex_lock+0x8c/0xc10 >> [ 2840.387251] led_classdev_register_ext+0x1c6/0x380 >> [ 2840.387256] input_leds_connect+0x139/0x260 >> [ 2840.387262] input_attach_handler.isra.0+0x75/0x90 >> [ 2840.387268] input_register_device.cold+0xa1/0x150 >> [ 2840.387274] hidinput_connect+0x848/0xb00 >> [ 2840.387280] hid_connect+0x567/0x5a0 >> [ 2840.387288] hid_hw_start+0x3f/0x60 >> [ 2840.387294] hid_device_probe+0x10d/0x190 >> [ 2840.387298] really_probe+0xde/0x340 >> [ 2840.387304] __driver_probe_device+0x78/0x110 >> [ 2840.387309] driver_probe_device+0x1f/0xa0 >> [ 2840.387314] __device_attach_driver+0x85/0x110 >> [ 2840.387320] bus_for_each_drv+0x78/0xc0 >> [ 2840.387326] __device_attach+0xb0/0x1b0 >> [ 2840.387332] bus_probe_device+0x94/0xb0 >> [ 2840.387337] device_add+0x64a/0x860 >> [ 2840.387343] hid_add_device+0xe5/0x240 >> [ 2840.387349] usbhid_probe+0x4bb/0x600 >> [ 2840.387356] usb_probe_interface+0xea/0x2b0 >> [ 2840.387363] really_probe+0xde/0x340 >> [ 2840.387368] __driver_probe_device+0x78/0x110 >> [ 2840.387373] driver_probe_device+0x1f/0xa0 >> [ 2840.387378] __device_attach_driver+0x85/0x110 >> [ 2840.387383] bus_for_each_drv+0x78/0xc0 >> [ 2840.387390] __device_attach+0xb0/0x1b0 >> [ 2840.387395] bus_probe_device+0x94/0xb0 >> [ 2840.387400] device_add+0x64a/0x860 >> [ 2840.387405] usb_set_configuration+0x5e8/0x880 >> [ 2840.387411] usb_generic_driver_probe+0x3e/0x60 >> [ 2840.387418] usb_probe_device+0x3d/0x120 >> [ 2840.387423] really_probe+0xde/0x340 >> [ 2840.387428] __driver_probe_device+0x78/0x110 >> [ 2840.387434] driver_probe_device+0x1f/0xa0 >> [ 2840.387439] __device_attach_driver+0x85/0x110 >> [ 2840.387444] bus_for_each_drv+0x78/0xc0 >> [ 2840.387451] __device_attach+0xb0/0x1b0 >> [ 2840.387456] bus_probe_device+0x94/0xb0 >> [ 2840.387461] device_add+0x64a/0x860 >> [ 2840.387466] usb_new_device.cold+0x141/0x38f >> [ 2840.387473] hub_event+0x1166/0x1980 >> [ 2840.387479] process_one_work+0x21a/0x590 >> [ 2840.387484] worker_thread+0x1d1/0x3e0 >> [ 2840.387488] kthread+0xee/0x120 >> [ 2840.387493] ret_from_fork+0x30/0x50 >> [ 2840.387500] ret_from_fork_asm+0x1a/0x30 >> [ 2840.387506] >> other info that might help us debug this: >> >> [ 2840.387509] Chain exists of: >> &led_cdev->led_access --> &led_cdev->trigger_lock --> input_mutex >> >> [ 2840.387520] Possible unsafe locking scenario: >> >> [ 2840.387523] CPU0 CPU1 >> [ 2840.387526] ---- ---- >> [ 2840.387529] lock(input_mutex); >> [ 2840.387534] lock(&led_cdev->trigger_lock); >> [ 2840.387540] lock(input_mutex); >> [ 2840.387545] lock(&led_cdev->led_access); >> [ 2840.387550] >> *** DEADLOCK *** >> > Given the locking order > > input_mutex > leds_list_lock > input_mutex > leds_list_lock >> >> Signed-off-by: Hans de Goede <hdegoede@xxxxxxxxxx> >> --- >> drivers/leds/trigger/ledtrig-input-events.c | 40 ++++++++++++++------- >> 1 file changed, 28 insertions(+), 12 deletions(-) >> >> diff --git a/drivers/leds/trigger/ledtrig-input-events.c b/drivers/leds/trigger/ledtrig-input-events.c >> index 94d5580ea093..813b5782b2d2 100644 >> --- a/drivers/leds/trigger/ledtrig-input-events.c >> +++ b/drivers/leds/trigger/ledtrig-input-events.c >> @@ -6,6 +6,7 @@ >> * Partially based on Atsushi Nemoto's ledtrig-heartbeat.c. >> */ >> >> +#include <linux/dev_printk.h> >> #include <linux/input.h> >> #include <linux/jiffies.h> >> #include <linux/leds.h> >> @@ -19,6 +20,8 @@ >> >> /* To avoid repeatedly setting the brightness while there are events */ >> #define FL_LED_ON 0 >> +#define FL_REGISTER_HANDLER 1 >> +#define FL_HANDLER_REGISTERED 2 >> >> struct input_events_data { >> struct input_handler handler; >> @@ -35,6 +38,16 @@ static void led_input_events_work(struct work_struct *work) >> { >> struct input_events_data *data = >> container_of(work, struct input_events_data, work.work); >> + int ret; >> + >> + /* Handler gets registered here (from work) to avoid locking issues */ >> + if (test_and_clear_bit(FL_REGISTER_HANDLER, &data->flags)) { >> + ret = input_register_handler(&data->handler); >> + if (ret) >> + dev_err(data->led_cdev->dev, "Failed to register input handler\n"); >> + else >> + set_bit(FL_HANDLER_REGISTERED, &data->flags); >> + } >> >> spin_lock_irq(&data->lock); >> >> @@ -164,7 +177,6 @@ static const struct input_device_id input_events_ids[] = { >> static int input_events_activate(struct led_classdev *led_cdev) >> { >> struct input_events_data *data; >> - int ret; >> >> data = kzalloc(sizeof(*data), GFP_KERNEL); >> if (!data) >> @@ -181,6 +193,7 @@ static int input_events_activate(struct led_classdev *led_cdev) >> >> data->led_cdev = led_cdev; >> data->led_cdev_saved_flags = led_cdev->flags; >> + data->led_off_time = jiffies; >> data->led_off_delay = msecs_to_jiffies(DEFAULT_LED_OFF_DELAY_MS); >> >> /* >> @@ -191,20 +204,15 @@ static int input_events_activate(struct led_classdev *led_cdev) >> if (!led_cdev->blink_brightness) >> led_cdev->blink_brightness = led_cdev->max_brightness; >> >> - /* Start with LED off */ >> - led_set_brightness_nosleep(data->led_cdev, LED_OFF); >> - >> - ret = input_register_handler(&data->handler); >> - if (ret) { >> - kfree(data); >> - return ret; >> - } >> - >> set_bit(LED_BLINK_SW, &led_cdev->work_flags); >> >> /* Turn LED off during suspend, original flags are restored on deactivate() */ >> led_cdev->flags |= LED_CORE_SUSPENDRESUME; >> >> + /* Handler gets registered from work to avoid locking issues */ >> + set_bit(FL_REGISTER_HANDLER, &data->flags); >> + queue_delayed_work(system_wq, &data->work, 0); >> + >> led_set_trigger_data(led_cdev, data); >> return 0; >> } >> @@ -213,10 +221,18 @@ static void input_events_deactivate(struct led_classdev *led_cdev) >> { >> struct input_events_data *data = led_get_trigger_data(led_cdev); >> >> + /* >> + * Ensure work queued from activate() has registered the handler >> + * before unregistering it. >> + */ >> + flush_delayed_work(&data->work); >> + if (test_bit(FL_HANDLER_REGISTERED, &data->flags)) >> + input_unregister_handler(&data->handler); >> + > > this change ends up with the deadlock intact. Right I already mentioned this in the cover-letter, this is why this series is RFC. I mostly still posted this upstream despite it not being a proper fix to document the troubles of the complex problem of LEDs or LED triggers getting registered by subsystems while holding a global lock from that subsystem vs the activate / deactivate method of the trigger calling functions of that same subsystem which require that same global lock. I wanted to document my attempt at fixing this because this is basically the same LED trigger problem as this 6.9+ regression: https://lore.kernel.org/regressions/42d498fc-c95b-4441-b81a-aee4237d1c0d@xxxxxxxxxxxxx/ which is a somewhat more complex version of the same base issue. >> + cancel_delayed_work_sync(&data->work); >> + >> led_cdev->flags = data->led_cdev_saved_flags; >> clear_bit(LED_BLINK_SW, &led_cdev->work_flags); >> - input_unregister_handler(&data->handler); >> - cancel_delayed_work_sync(&data->work); >> kfree(data); >> } >> >> -- >> 2.45.1 > > One option looks like adding .pre_deactivate cb to trigger, and invoking it > in the deactivate path. It breaks the lock chain above by taking input_mutex > outside leds_list_lock. Thank you for the suggestion. As I also mentioned in the cover-letter (and have implemented in the mean time), I have fixed this by moving to using a simple input_handler registered at module_init() time + calling input_trigger_event() on the trigger to set the brightness of LEDs which have this trigger active. Compared to the old code this looses the ability for the user to configure a different brightness for the on state then LED_FULL, but that is something which I can live with and since this trigger is only in for-leds-next atm loosing that functionality is not a regression. I'll post this fix for the locking issue to the list soon. Regards, Hans > --- a/drivers/leds/led-triggers.c > +++ b/drivers/leds/led-triggers.c > @@ -323,6 +323,9 @@ void led_trigger_unregister(struct led_t > list_del_init(&trig->next_trig); > up_write(&triggers_list_lock); > > + if (trig->pre_deactivate) > + trig->pre_deactivate(trig); > + > /* Remove anyone actively using this trigger */ > down_read(&leds_list_lock); > list_for_each_entry(led_cdev, &leds_list, node) { >