Re: lockdep and ww mutex debug interactions in hdmi tests

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Nov 07, 2024 at 06:26:23AM +1000, Dave Airlie wrote:
> On Fri, 1 Nov 2024 at 23:57, Maxime Ripard <mripard@xxxxxxxxxx> wrote:
> >
> > Hi,
> >
> > On Wed, Oct 30, 2024 at 05:03:50AM +1000, Dave Airlie wrote:
> > > Hi,
> > >
> > > I mentioned this internally, but wanted to get it on the list,
> > >
> > > I ran the hdmi kunit tests with LOCKDEP and WW_MUTEX_SLOWPATH enabled
> > > and hit some issues.
> > >
> > > With the slowpath we get the occasional EDEADLK to test the paths are
> > > doing things right, I think you should handle EDEADLK in the tests
> > > with a retry loop.
> >
> > Thanks for the report, I've just sent a patch fixing this.
> 
> The patch fixes the EDEADLK but not the lockdep
> 
> [   50.785446] KTAP version 1
> [   50.785461] 1..2
> [   50.786298]     KTAP version 1
> [   50.786305]     # Subtest: drm_atomic_helper_connector_hdmi_check
> [   50.786308]     # module: drm_hdmi_state_helper_test
> [   50.786312]     1..22
> 
> [   50.788096] ======================================================
> [   50.788101] WARNING: possible circular locking dependency detected
> [   50.788107] 6.12.0-rc6+ #47 Tainted: G                 N
> [   50.788112] ------------------------------------------------------
> [   50.788117] kunit_try_catch/1500 is trying to acquire lock:
> [   50.788123] ffff9976410cc4f0 (&dev->mode_config.mutex){+.+.}-{3:3},
> at: drm_test_check_broadcast_rgb_auto_cea_mode+0xaf/0x4c0
> [drm_hdmi_state_helper_test]
> [   50.788141]
>                but task is already holding lock:
> [   50.788146] ffff9976be5550f0 (crtc_ww_class_acquire){+.+.}-{0:0},
> at: drm_kunit_helper_acquire_ctx_alloc+0x4d/0xc0 [drm_kunit_helpers]
> [   50.788159]
>                which lock already depends on the new lock.
> 
> [   50.788165]
>                the existing dependency chain (in reverse order) is:
> [   50.788171]
>                -> #1 (crtc_ww_class_acquire){+.+.}-{0:0}:
> [   50.788179]        drm_modeset_acquire_init+0xd7/0x110 [drm]
> [   50.788235]
> drm_helper_probe_single_connector_modes+0x4c/0x600 [drm_kms_helper]
> [   50.788266]        set_connector_edid.isra.0+0x4f/0xc0
> [drm_hdmi_state_helper_test]
> [   50.788275]
> drm_atomic_helper_connector_hdmi_init+0x240/0x400
> [drm_hdmi_state_helper_test]
> [   50.788285]
> drm_test_check_broadcast_rgb_auto_cea_mode+0x27/0x4c0
> [drm_hdmi_state_helper_test]
> [   50.788296]        kunit_try_run_case+0x62/0xd0 [kunit]
> [   50.788304]        kunit_generic_run_threadfn_adapter+0x1e/0x40 [kunit]
> [   50.788313]        kthread+0xef/0x120
> [   50.788318]        ret_from_fork+0x31/0x50
> [   50.788324]        ret_from_fork_asm+0x1a/0x30
> [   50.788329]
>                -> #0 (&dev->mode_config.mutex){+.+.}-{3:3}:
> [   50.788337]        __lock_acquire+0x1391/0x2190
> [   50.788343]        lock_acquire+0xcc/0x2d0
> [   50.788348]        __mutex_lock+0x8d/0xbf0
> [   50.788353]
> drm_test_check_broadcast_rgb_auto_cea_mode+0xaf/0x4c0
> [drm_hdmi_state_helper_test]
> [   50.788363]        kunit_try_run_case+0x62/0xd0 [kunit]
> [   50.788371]        kunit_generic_run_threadfn_adapter+0x1e/0x40 [kunit]
> [   50.788380]        kthread+0xef/0x120
> [   50.788384]        ret_from_fork+0x31/0x50
> [   50.788388]        ret_from_fork_asm+0x1a/0x30
> [   50.788393]
>                other info that might help us debug this:
> 
> [   50.788400]  Possible unsafe locking scenario:
> 
> [   50.788405]        CPU0                    CPU1
> [   50.788409]        ----                    ----
> [   50.788413]   lock(crtc_ww_class_acquire);
> [   50.788418]                                lock(&dev->mode_config.mutex);
> [   50.788424]                                lock(crtc_ww_class_acquire);
> [   50.788431]   lock(&dev->mode_config.mutex);
> [   50.788435]
>                 *** DEADLOCK ***
> 
> [   50.788441] 1 lock held by kunit_try_catch/1500:
> [   50.788445]  #0: ffff9976be5550f0
> (crtc_ww_class_acquire){+.+.}-{0:0}, at:
> drm_kunit_helper_acquire_ctx_alloc+0x4d/0xc0 [drm_kunit_helpers]
> [   50.788459]
>                stack backtrace:
> [   50.788464] CPU: 5 UID: 0 PID: 1500 Comm: kunit_try_catch Tainted:
> G                 N 6.12.0-rc6+ #47
> [   50.788473] Tainted: [N]=TEST
> [   50.788476] Hardware name: Gigabyte Technology Co., Ltd. Z390 I
> AORUS PRO WIFI/Z390 I AORUS PRO WIFI-CF, BIOS F8 11/05/2021
> [   50.788485] Call Trace:
> [   50.788488]  <TASK>
> [   50.788492]  dump_stack_lvl+0x6c/0xa0
> [   50.788498]  print_circular_bug.cold+0x178/0x1be
> [   50.788506]  check_noncircular+0x10f/0x120
> [   50.788511]  ? stack_trace_save+0x3e/0x50
> [   50.788520]  __lock_acquire+0x1391/0x2190
> [   50.788528]  lock_acquire+0xcc/0x2d0
> [   50.788533]  ?
> drm_test_check_broadcast_rgb_auto_cea_mode+0xaf/0x4c0
> [drm_hdmi_state_helper_test]
> [   50.788544]  ? lock_is_held_type+0xd9/0x130
> [   50.788552]  __mutex_lock+0x8d/0xbf0
> [   50.788556]  ?
> drm_test_check_broadcast_rgb_auto_cea_mode+0xaf/0x4c0
> [drm_hdmi_state_helper_test]
> [   50.788566]  ? _raw_spin_unlock_irqrestore+0x39/0x70
> [   50.788573]  ? kunit_add_action+0xd1/0x140 [kunit]
> [   50.788581]  ?
> drm_test_check_broadcast_rgb_auto_cea_mode+0xaf/0x4c0
> [drm_hdmi_state_helper_test]
> [   50.788592]  ? __pfx_action_drm_release_context+0x10/0x10 [drm_kunit_helpers]
> [   50.788599]  ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10 [kunit]
> [   50.788608]  ? kunit_add_action_or_reset+0x18/0x40 [kunit]
> [   50.788618]  ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10 [kunit]
> [   50.788627]  ?
> drm_test_check_broadcast_rgb_auto_cea_mode+0xaf/0x4c0
> [drm_hdmi_state_helper_test]
> [   50.788637]  drm_test_check_broadcast_rgb_auto_cea_mode+0xaf/0x4c0
> [drm_hdmi_state_helper_test]
> [   50.788647]  ? lockdep_hardirqs_on+0x7c/0x100
> [   50.788654]  kunit_try_run_case+0x62/0xd0 [kunit]
> [   50.788662]  ? lockdep_hardirqs_on+0x7c/0x100
> [   50.788668]  ? _raw_spin_unlock_irqrestore+0x39/0x70
> [   50.788675]  kunit_generic_run_threadfn_adapter+0x1e/0x40 [kunit]
> [   50.788684]  kthread+0xef/0x120
> [   50.788688]  ? __pfx_kthread+0x10/0x10
> [   50.788693]  ret_from_fork+0x31/0x50
> [   50.788698]  ? __pfx_kthread+0x10/0x10
> [   50.788703]  ret_from_fork_asm+0x1a/0x30
> [   50.788711]  </TASK>

Yeah, sorry. I finally got the time to look into it a bit, and it's
(almost) fixed now:
https://lore.kernel.org/dri-devel/20250129-test-kunit-v2-0-fe59c43805d5@xxxxxxxxxx/

Maxime

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux