Re: Converting dev->mutex into dev->spinlock ?

Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> · Sun, 5 Feb 2023 01:12:12 +0900

On 2023/02/05 0:34, Alan Stern wrote:
>> A few of examples:
>>
>>   https://syzkaller.appspot.com/bug?extid=2d6ac90723742279e101
> 
> It's hard to figure out what's wrong from looking at the syzbot report.  
> What makes you think it is connected with dev->mutex?
> 
> At first glance, it seems that the ath6kl driver is trying to flush a 
> workqueue while holding a lock or mutex that is needed by one of the 
> jobs in the workqueue.  That's obviously never going to work, no matter 
> what sort of lockdep validation gets used.

That lock is exactly dev->mutex where lockdep validation is disabled.
If lockdep validation on dev->mutex were not disabled, we can catch
possibility of deadlock before khungtaskd reports real deadlock as hung.

Lockdep validation on dev->mutex being disabled is really annoying, and
I want to make lockdep validation on dev->mutex enabled; that is the
"drivers/core: Remove lockdep_set_novalidate_class() usage" patch.

-------- Forwarded Message --------
Message-ID: <5e4d20a0-08a3-9736-b6ef-cda00acca63f@xxxxxxxxxxxxxxxxxxx>
Date: Sun, 3 Jul 2022 23:29:16 +0900
From: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
Subject: Re: INFO: task hung in ath6kl_usb_power_off

I finally found why lockdep was not able to report this deadlock.
device_initialize() was hiding dev->mutex from lockdep tests.
I hope we can get rid of this lockdep_set_novalidate_class()...

Below is a reproducer kernel module. If request_firmware_nowait() called

  INIT_WORK(&fw_work->work, request_firmware_work_func);
  schedule_work(&fw_work->work);

before hub_event() calls

  usb_lock_device(hdev);

, flush_scheduled_work() from hub_event() becomes responsible for flushing
fw_work->work. But flush_scheduled_work() can't flush fw_work->work because
dev->mutex is held by hub_event().

----------------------------------------
#include <linux/module.h>
#include <linux/sched.h>

static DEFINE_MUTEX(mutex);

static void request_firmware_work_func(struct work_struct *work)
{
	schedule_timeout_uninterruptible(HZ); // inject race window for allowing hub_event() to find this work
	mutex_lock(&mutex); // device_lock(parent) from ath9k_hif_usb_firmware_fail() waits for ath6kl_hif_power_off() to release dev->mutex
	mutex_unlock(&mutex); // device_unlock(parent) from ath9k_hif_usb_firmware_fail()
}
static void hub_event(struct work_struct *work)
{
	mutex_lock(&mutex); // usb_lock_device(hdev)
	flush_scheduled_work(); // ath6kl_usb_flush_all() from ath6kl_hif_power_off() waits for request_firmware_work_func() while holding dev->mutex
	mutex_unlock(&mutex); // usb_unlock_device(hdev)
}

static DECLARE_WORK(fw_work, request_firmware_work_func);
static DECLARE_WORK(hub_events, hub_event);

static int __init test_init(void)
{
	lockdep_set_novalidate_class(&mutex); // device_initialize() suppresses lockdep warning
	schedule_work(&fw_work); // request_firmware_nowait() from ath9k driver queues into system_wq
	queue_work(system_freezable_wq, &hub_events); // kick_hub_wq() from usb code queues into hub_wq
	return 0;
}

static void test_exit(void)
{
}

module_init(test_init);
module_exit(test_exit);
MODULE_LICENSE("GPL");
----------------------------------------

----------------------------------------
[   38.832553][ T2786] test: loading out-of-tree module taints kernel.
[  187.116969][   T35] INFO: task kworker/0:2:33 blocked for more than 143 seconds.
[  187.121366][   T35]       Tainted: G           O      5.19.0-rc4-next-20220701 #43
[  187.124830][   T35] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  187.128724][   T35] task:kworker/0:2     state:D stack:    0 pid:   33 ppid:     2 flags:0x00004000
[  187.133235][   T35] Workqueue: events_freezable hub_event [test]
[  187.136000][   T35] Call Trace:
[  187.137512][   T35]  <TASK>
[  187.138863][   T35]  __schedule+0x304/0x8f0
[  187.140923][   T35]  schedule+0x40/0xa0
[  187.142940][   T35]  schedule_timeout+0x300/0x3a0
[  187.145341][   T35]  ? mark_held_locks+0x55/0x80
[  187.147518][   T35]  ? wait_for_completion+0x6b/0x130
[  187.149257][   T35]  ? _raw_spin_unlock_irq+0x22/0x30
[  187.151187][   T35]  ? wait_for_completion+0x6b/0x130
[  187.152915][   T35]  ? trace_hardirqs_on+0x3b/0xd0
[  187.154388][   T35]  wait_for_completion+0x73/0x130
[  187.155547][   T35]  __flush_workqueue+0x17b/0x480
[  187.156710][   T35]  ? __mutex_lock+0x12b/0xe10
[  187.157703][   T35]  ? wait_for_completion+0x2d/0x130
[  187.158837][   T35]  hub_event+0x1e/0x30 [test]
[  187.160286][   T35]  ? hub_event+0x1e/0x30 [test]
[  187.161352][   T35]  process_one_work+0x292/0x570
[  187.162565][   T35]  worker_thread+0x2f/0x3d0
[  187.163590][   T35]  ? process_one_work+0x570/0x570
[  187.164779][   T35]  kthread+0xd6/0x100
[  187.165940][   T35]  ? kthread_complete_and_exit+0x20/0x20
[  187.167077][   T35]  ret_from_fork+0x1f/0x30
[  187.168319][   T35]  </TASK>
[  187.169190][   T35] INFO: task kworker/0:3:54 blocked for more than 143 seconds.
[  187.171458][   T35]       Tainted: G           O      5.19.0-rc4-next-20220701 #43
[  187.173380][   T35] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  187.175359][   T35] task:kworker/0:3     state:D stack:    0 pid:   54 ppid:     2 flags:0x00004000
[  187.177660][   T35] Workqueue: events request_firmware_work_func [test]
[  187.179221][   T35] Call Trace:
[  187.180381][   T35]  <TASK>
[  187.181170][   T35]  __schedule+0x304/0x8f0
[  187.182618][   T35]  schedule+0x40/0xa0
[  187.184207][   T35]  schedule_preempt_disabled+0x10/0x20
[  187.186318][   T35]  __mutex_lock+0x650/0xe10
[  187.188527][   T35]  ? request_firmware_work_func+0x1c/0x30 [test]
[  187.190094][   T35]  mutex_lock_nested+0x16/0x20
[  187.191905][   T35]  ? mutex_lock_nested+0x16/0x20
[  187.193159][   T35]  request_firmware_work_func+0x1c/0x30 [test]
[  187.194643][   T35]  process_one_work+0x292/0x570
[  187.195899][   T35]  worker_thread+0x2f/0x3d0
[  187.197031][   T35]  ? process_one_work+0x570/0x570
[  187.198489][   T35]  kthread+0xd6/0x100
[  187.199756][   T35]  ? kthread_complete_and_exit+0x20/0x20
[  187.201224][   T35]  ret_from_fork+0x1f/0x30
[  187.202374][   T35]  </TASK>
[  187.203293][   T35] 
[  187.203293][   T35] Showing all locks held in the system:
[  187.205200][   T35] 1 lock held by rcu_tasks_trace/10:
[  187.206489][   T35]  #0: ffffffffb25bdea0 (rcu_tasks_trace.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x28/0x3b0
[  187.209383][   T35] 3 locks held by kworker/0:2/33:
[  187.210930][   T35]  #0: ffff9e2bc004d148 ((wq_completion)events_freezable){+.+.}-{0:0}, at: process_one_work+0x215/0x570
[  187.214012][   T35]  #1: ffffac448032fe68 (hub_events){+.+.}-{0:0}, at: process_one_work+0x215/0x570
[  187.217162][   T35]  #2: ffffffffc067b130 (&dev->mutex){....}-{3:3}, at: hub_event+0x12/0x30 [test]
[  187.220277][   T35] 1 lock held by khungtaskd/35:
[  187.221592][   T35]  #0: ffffffffb25be5c0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire.constprop.56+0x0/0x30
[  187.224310][   T35] 3 locks held by kworker/0:3/54:
[  187.225596][   T35]  #0: ffff9e2bc004cb48 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x215/0x570
[  187.228147][   T35]  #1: ffffac4480cebe68 (fw_work){+.+.}-{0:0}, at: process_one_work+0x215/0x570
[  187.230646][   T35]  #2: ffffffffc067b130 (&dev->mutex){....}-{3:3}, at: request_firmware_work_func+0x1c/0x30 [test]
[  187.234203][   T35] 2 locks held by agetty/2729:
[  187.235463][   T35]  #0: ffff9e2bc71740a0 (&tty->ldisc_sem){++++}-{0:0}, at: ldsem_down_read+0xe/0x10
[  187.237864][   T35]  #1: ffffac448229f2f8 (&ldata->atomic_read_lock){+.+.}-{3:3}, at: n_tty_read+0x168/0x5f0
[  187.240571][   T35] 2 locks held by agetty/2731:
[  187.241876][   T35]  #0: ffff9e2bc907b8a0 (&tty->ldisc_sem){++++}-{0:0}, at: ldsem_down_read+0xe/0x10
[  187.244312][   T35]  #1: ffffac44822a72f8 (&ldata->atomic_read_lock){+.+.}-{3:3}, at: n_tty_read+0x168/0x5f0
[  187.246938][   T35] 
[  187.247756][   T35] =============================================
[  187.247756][   T35] 
----------------------------------------

> However, it is always safe to acquire a child device's lock while 
> holding the parent's lock.  Lockdep isn't aware of this because it 
> doesn't understand the hierarchical device tree.  That's why lockdep 
> checking has to be disabled for dev->mutex; if it weren't disabled then 
> it would constantly report false positives.

Even if it is always safe to acquire a child device's lock while holding
the parent's lock, disabling lockdep checks completely on device's lock is
not safe.