On Wed, 8 Mar 2023 07:17:38 +0100
Christian König <christian.koenig@xxxxxxx> wrote:

> Am 08.03.23 um 03:26 schrieb Steven Rostedt:
> > On Tue, 7 Mar 2023 21:22:23 -0500
> > Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> >
> >> Looks like there was a lock possibly used after free. But as commit
> >> 9bff18d13473a9fdf81d5158248472a9d8ecf2bd ("drm/ttm: use per BO cleanup
> >> workers") changed a lot of this code, I figured it may be the culprit.
> > If I bothered to look at the second warning after this one (I usually stop
> > after the first), it appears to state there was a use after free issue.
>
> Yeah, that looks like the reference count was somehow messed up.
>
> What test case/environment do you run to trigger this?
>
> Thanks for the notice,

I'm still getting this on Linus's latest tree.

[ 230.530222] ------------[ cut here ]------------
[ 230.569795] DEBUG_LOCKS_WARN_ON(lock->magic != lock)
[ 230.569957] WARNING: CPU: 0 PID: 212 at kernel/locking/mutex.c:582 __ww_mutex_lock.constprop.0+0x62a/0x1300
[ 230.612599] Modules linked in:
[ 230.632144] CPU: 0 PID: 212 Comm: kworker/0:8H Not tainted 6.3.0-rc2-test-00047-g6015b1aca1a2-dirty #992
[ 230.654939] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-debian-1.16.0-5 04/01/2014
[ 230.678866] Workqueue: ttm ttm_bo_delayed_delete
[ 230.699452] EIP: __ww_mutex_lock.constprop.0+0x62a/0x1300
[ 230.720582] Code: e8 3b 9a 95 ff 85 c0 0f 84 61 fa ff ff 8b 0d 58 bc 3a c4 85 c9 0f 85 53 fa ff ff 68 54 98 06 c4 68 b7 b6 04 c4 e8 46 af 40 ff <0f> 0b 58 5a e9 3b fa ff ff 8d 74 26 00 90 a1 ec 47 b0 c4 85 c0 75
[ 230.768336] EAX: 00000028 EBX: 00000000 ECX: c51abdd8 EDX: 00000002
[ 230.792001] ESI: 00000000 EDI: c53856bc EBP: c51abf00 ESP: c51abeac
[ 230.815944] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010246
[ 230.840033] CR0: 80050033 CR2: ff9ff000 CR3: 04506000 CR4: 00150ef0
[ 230.864059] Call Trace:
[ 230.886369] ? ttm_bo_delayed_delete+0x30/0x94
[ 230.909902] ww_mutex_lock+0x32/0x94
[ 230.932550] ttm_bo_delayed_delete+0x30/0x94
[ 230.955798] process_one_work+0x21a/0x484
[ 230.979335] worker_thread+0x14a/0x39c
[ 231.002258] kthread+0xea/0x10c
[ 231.024769] ? process_one_work+0x484/0x484
[ 231.047870] ? kthread_complete_and_exit+0x1c/0x1c
[ 231.071498] ret_from_fork+0x1c/0x28
[ 231.094701] irq event stamp: 4023
[ 231.117272] hardirqs last enabled at (4023): [<c3d1df99>] _raw_spin_unlock_irqrestore+0x2d/0x58
[ 231.143217] hardirqs last disabled at (4022): [<c31d5a55>] kvfree_call_rcu+0x155/0x2ec
[ 231.166058] softirqs last enabled at (3460): [<c3d1f403>] __do_softirq+0x2c3/0x3bb
[ 231.183104] softirqs last disabled at (3455): [<c30c96a9>] call_on_stack+0x45/0x4c
[ 231.200336] ---[ end trace 0000000000000000 ]---
[ 231.216572] ------------[ cut here ]------------

This is preventing me from adding any of my own patches on v6.3-rcX due to
this bug failing my tests. Which means I can't add anything to linux-next
until this is fixed!

-- Steve
Attachment:
config-fail
Description: Binary data