-----邮件原件----- 发件人: Liu Lucas/刘保柱 发送时间: 2023年8月1日 16:59 收件人: 'Liviu Dudau' <liviu.dudau@xxxxxxx> 抄送: airlied@xxxxxxxxx; daniel@xxxxxxxx; Huang Menghui/黄梦辉 <menghui.huang@xxxxxxxxxxxx> 主题: 回复: 回复: [PATCH] drm/komeda: drop all currently held locks if deadlock happens Ok , later, I will send a new patch. -----邮件原件----- 发件人: Liviu Dudau <liviu.dudau@xxxxxxx> 发送时间: 2023年8月1日 16:38 收件人: Liu Lucas/刘保柱 <lucas.liu@xxxxxxxxxxxx> 抄送: airlied@xxxxxxxxx; daniel@xxxxxxxx; Huang Menghui/黄梦辉 <menghui.huang@xxxxxxxxxxxx> 主题: Re: 回复: [PATCH] drm/komeda: drop all currently held locks if deadlock happens On Tue, Aug 01, 2023 at 02:16:46AM +0000, Liu Lucas/刘保柱 wrote: > Hello, Hi Liu, > > I'm sorry that the previous modification was not explained in detail. > (1). We are doing a deadlock detection on the kernel, so turn on the debug option for locks: > CONFIG_LOCKDEP=y > CONFIG_PROVE_LOCKING=y > CONFIG_LOCK_STAT=y > CONFIG_DEBUG_RT_MUTEXES=y > CONFIG_DEBUG_SPINLOCK=y > CONFIG_DEBUG_MUTEXES=y > CONFIG_DEBUG_RWSEMS=y > CONFIG_DEBUG_WW_MUTEX_SLOWPATH=y > CONFIG_DEBUG_LOCK_ALLOC=y > CONFIG_DEBUG_ATOMIC_SLEEP=y > CONFIG_PROVE_RAW_LOCK_NESTING=y > > Problems with the CONFIG_DEBUG_WW_MUTEX_SLOWPATH option were found: > > ------------[ cut here ]------------ > WARNING: CPU: 2 PID: 345 at > drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c:1248 > komeda_release_unclaimed_resources+0x13c/0x170 > Modules linked in: > CPU: 2 PID: 345 Comm: composer@2.1-se Kdump: loaded Tainted: G W 5.10.110-SE-SDK1.8-dirty #16 > Hardware name: Siengine Se1000 Evaluation board (DT) > pstate: 20400009 (nzCv daif +PAN -UAO -TCO BTYPE=--) pc : > komeda_release_unclaimed_resources+0x13c/0x170 > lr : komeda_release_unclaimed_resources+0xbc/0x170 > sp : ffff800017b8b8d0 > pmr_save: 000000e0 > x29: ffff800017b8b8d0 x28: ffff000cf2f96200 > x27: ffff000c8f5a8800 x26: 0000000000000000 > x25: 0000000000000038 x24: ffff8000116a0140 > x23: 0000000000000038 x22: ffff000cf2f96200 > x21: ffff000cfc300300 x20: ffff000c8ab77080 > x19: 0000000000000003 x18: 0000000000000000 > x17: 0000000000000000 x16: 0000000000000000 > x15: b400e638f738ba38 x14: 0000000000000000 > x13: 0000000106400a00 x12: 0000000000000000 > x11: 0000000000000000 x10: 0000000000000000 > x9 : ffff800012f80000 x8 : ffff000ca3308000 > x7 : 0000000ff3000000 x6 : ffff80001084034c > x5 : ffff800017b8bc40 x4 : 000000000000000f > x3 : ffff000ca3308000 x2 : 0000000000000000 > x1 : 0000000000000000 x0 : ffffffffffffffdd Call trace: > komeda_release_unclaimed_resources+0x13c/0x170 > komeda_crtc_atomic_check+0x68/0xf0 > drm_atomic_helper_check_planes+0x138/0x1f4 > komeda_kms_check+0x284/0x36c > drm_atomic_check_only+0x40c/0x714 > drm_atomic_nonblocking_commit+0x1c/0x60 > drm_mode_atomic_ioctl+0xa3c/0xb8c > drm_ioctl_kernel+0xc4/0x120 > drm_ioctl+0x268/0x534 > __arm64_sys_ioctl+0xa8/0xf0 > el0_svc_common.constprop.0+0x80/0x240 > do_el0_svc+0x24/0x90 > el0_svc+0x20/0x30 > el0_sync_handler+0xe8/0xf0 > el0_sync+0x1a4/0x1c0 > irq event stamp: 0 > hardirqs last enabled at (0): [<0000000000000000>] 0x0 hardirqs last > disabled at (0): [<ffff800010056d34>] copy_process+0x5d0/0x183c > softirqs last enabled at (0): [<ffff800010056d34>] > copy_process+0x5d0/0x183c softirqs last disabled at (0): > [<0000000000000000>] 0x0 ---[ end trace 20ae984fa860184a ]--- > ------------[ cut here ]------------ > WARNING: CPU: 3 PID: 345 at drivers/gpu/drm/drm_modeset_lock.c:228 > drm_modeset_drop_locks+0x84/0x90 Modules linked in: > CPU: 3 PID: 345 Comm: composer@2.1-se Kdump: loaded Tainted: G W 5.10.110-SE-SDK1.8-dirty #16 > Hardware name: Siengine Se1000 Evaluation board (DT) > pstate: 20400009 (nzCv daif +PAN -UAO -TCO BTYPE=--) pc : > drm_modeset_drop_locks+0x84/0x90 lr : > drm_mode_atomic_ioctl+0x860/0xb8c sp : ffff800017b8bb10 > pmr_save: 000000e0 > x29: ffff800017b8bb10 x28: 0000000000000001 > x27: 0000000000000038 x26: 0000000000000002 > x25: ffff000cecbefa00 x24: ffff000cf2f96200 > x23: 0000000000000001 x22: 0000000000000018 > x21: 0000000000000001 x20: ffff800017b8bc10 > x19: 0000000000000000 x18: 0000000000000000 > x17: 0000000002e8bf2c x16: 0000000002e94c6b > x15: 0000000002ea48b9 x14: ffff8000121f0300 > x13: 0000000002ee2ca8 x12: ffff80001129cae0 > x11: ffff800012435000 x10: ffff000ed46b5e88 > x9 : ffff000c9935e600 x8 : 0000000000000000 > x7 : 000000008020001e x6 : 000000008020001f > x5 : ffff80001085fbe0 x4 : fffffe0033a59f20 > x3 : 000000008020001e x2 : 0000000000000000 > x1 : 0000000000000000 x0 : ffff000c8f596090 Call trace: > drm_modeset_drop_locks+0x84/0x90 > drm_mode_atomic_ioctl+0x860/0xb8c > drm_ioctl_kernel+0xc4/0x120 > drm_ioctl+0x268/0x534 > __arm64_sys_ioctl+0xa8/0xf0 > el0_svc_common.constprop.0+0x80/0x240 > do_el0_svc+0x24/0x90 > el0_svc+0x20/0x30 > el0_sync_handler+0xe8/0xf0 > el0_sync+0x1a4/0x1c0 > irq event stamp: 0 > hardirqs last enabled at (0): [<0000000000000000>] 0x0 hardirqs last > disabled at (0): [<ffff800010056d34>] copy_process+0x5d0/0x183c > softirqs last enabled at (0): [<ffff800010056d34>] > copy_process+0x5d0/0x183c softirqs last disabled at (0): > [<0000000000000000>] 0x0 ---[ end trace 20ae984fa860184b ]--- > > (2). According to the call trace information, it can be located to be WARN_ON(IS_ERR(c_st)) in the komeda_pipeline_unbound_components function; Then follow the function. > komeda_pipeline_unbound_components > -> komeda_component_get_state_and_set_user > -> komeda_pipeline_get_state_and_set_crtc > -> komeda_pipeline_get_state > ->drm_atomic_get_private_obj_state > -> drm_atomic_get_private_obj_state > -> drm_modeset_lock > > > komeda_pipeline_unbound_components > -> komeda_component_get_state_and_set_user > -> komeda_component_get_state > -> drm_atomic_get_private_obj_state > -> drm_modeset_lock > > ret = drm_modeset_lock(&obj->lock, state->acquire_ctx); if (ret) > return ERR_PTR(ret); > Here it return -EDEADLK. > > (3). Therefore, deal with the deadlock as suggested by [1], using the > function drm_modeset_backoff(). > [1] > https://docs.kernel.org/gpu/drm-kms.html?highlight=kms#kms-locking > > According to the call trace information: > Call trace: > komeda_release_unclaimed_resources+0x13c/0x170 > komeda_crtc_atomic_check+0x68/0xf0 > drm_atomic_helper_check_planes+0x138/0x1f4 > komeda_kms_check+0x284/0x36c > drm_atomic_check_only+0x40c/0x714 > drm_atomic_nonblocking_commit+0x1c/0x60 > drm_mode_atomic_ioctl+0xa3c/0xb8c > drm_ioctl_kernel+0xc4/0x120 > drm_ioctl+0x268/0x534 This is a much better description of the problem that contains the relevant information for me. Can you please send a new revision of the patch where the commit message is the text above? With that, you can have my Reviewed-by tag. Best regards, Liviu > > Add a determination of the return value to the function komeda_pipeline_unbound_components, Finally, komeda_release_unclaimed_resources returns to the drm_mode_atomic_ioctl function to call drm_modeset_backoff based on the value of ret. > > (4). WARN_ON(IS_ERR(c_st)); This code can be retained, the modification was not carefully considered at that time, I will modify this patch again later and submit a new one. > c_st = komeda_component_get_state_and_set_user(c, > drm_st, NULL, new->crtc); > - WARN_ON(IS_ERR(c_st)); > + if (PTR_ERR(c_st) == -EDEADLK) > + return -EDEADLK; > + else > + WARN_ON(IS_ERR(c_st)); > > -----邮件原件----- > 发件人: Liviu Dudau <liviu.dudau@xxxxxxx> > 发送时间: 2023年7月31日 22:11 > 收件人: Huang Menghui/黄梦辉 <menghui.huang@xxxxxxxxxxxx> > 抄送: airlied@xxxxxxxxx; daniel@xxxxxxxx; Liu Lucas/刘保柱 > <lucas.liu@xxxxxxxxxxxx> > 主题: Re: [PATCH] drm/komeda: drop all currently held locks if deadlock > happens > > Hello, > > On Mon, Jul 31, 2023 at 04:08:43PM +0800, menghui.huang wrote: > > From: "baozhu.liu" <lucas.liu@xxxxxxxxxxxx> > > > > If komeda_pipeline_unbound_components() returns -EDEADLK, it means > > that a deadlock happened in the locking context. > > Currently, komeda is not dealing with the deadlock properly, > > producing the following output when CONFIG_DEBUG_WW_MUTEX_SLOWPATH is enabled: > > > > ------------[ cut here ]------------ > > WARNING: CPU: 2 PID: 345 at drivers/gpu/drm/drm_modeset_lock.c:228 > > drm_modeset_drop_locks+0x84/0x90 Modules linked in: > > CPU: 2 PID: 345 Comm: composer@2.1-se Kdump: loaded Tainted: G W > > Hardware name: Siengine Se1000 Evaluation board (DT) > > pstate: 20400009 (nzCv daif +PAN -UAO -TCO BTYPE=--) pc : > > drm_modeset_drop_locks+0x84/0x90 lr : > > drm_mode_atomic_ioctl+0x860/0xb8c sp : ffff800017b8bb10 > > pmr_save: 000000e0 > > x29: ffff800017b8bb10 x28: 0000000000000001 > > x27: 0000000000000038 x26: 0000000000000002 > > x25: ffff000d03a6cc00 x24: ffff000cf2fe9000 > > x23: 0000000000000001 x22: 0000000000000018 > > x21: 0000000000000001 x20: ffff800017b8bc10 > > x19: 0000000000000000 x18: 0000000000000000 > > x17: 000000000000a2bb x16: 0000000000f77f9a > > x15: 00000000025a0830 x14: ffff8000121f0300 > > x13: 0000000005740083 x12: ffff80001129cae0 > > x11: ffff800012435000 x10: ffff000ed46b5e88 > > x9 : ffff000c9935e600 x8 : 0000000000000000 > > x7 : 0000000000000001 x6 : 00000000000ba57a > > x5 : ffff000cf458e400 x4 : ffff000ed4c528a0 > > x3 : 00000000000ba582 x2 : 0000000000000000 > > x1 : 0000000000000000 x0 : ffff000c8f597290 Call trace: > > drm_modeset_drop_locks+0x84/0x90 > > drm_mode_atomic_ioctl+0x860/0xb8c > > drm_ioctl_kernel+0xc4/0x120 > > drm_ioctl+0x268/0x534 > > __arm64_sys_ioctl+0xa8/0xf0 > > el0_svc_common.constprop.0+0x80/0x240 > > do_el0_svc+0x24/0x90 > > el0_svc+0x20/0x30 > > el0_sync_handler+0xe8/0xf0 > > el0_sync+0x1a4/0x1c0 > > irq event stamp: 0 > > hardirqs last enabled at (0): [<0000000000000000>] 0x0 hardirqs > > last disabled at (0): [<ffff800010056d34>] copy_process+0x5d0/0x183c > > softirqs last enabled at (0): [<ffff800010056d34>] > > copy_process+0x5d0/0x183c softirqs last disabled at (0): > > [<0000000000000000>] 0x0 ---[ end trace 20ae984fa8601849 ]--- > > > > Therefore, handling this deadlock can be solved by adding return > > -EDEADLK back to the drm_modeset_backoff processing flow in the > > drm_mode_atomic_ioctl function. > > > > Signed-off-by: baozhu.liu <lucas.liu@xxxxxxxxxxxx> > > --- > > .../gpu/drm/arm/display/komeda/komeda_pipeline_state.c | 10 > > ++++++---- > > 1 file changed, 6 insertions(+), 4 deletions(-) > > > > diff --git > > a/drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c > > b/drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c > > index 3276a3e82c62..8ad021259a37 100644 > > --- a/drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c > > +++ b/drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c > > @@ -1223,7 +1223,7 @@ int komeda_build_display_data_flow(struct komeda_crtc *kcrtc, > > return 0; > > } > > > > -static void > > +static int > > komeda_pipeline_unbound_components(struct komeda_pipeline *pipe, > > struct komeda_pipeline_state *new) { @@ -1243,8 +1243,11 @@ > > komeda_pipeline_unbound_components(struct komeda_pipeline *pipe, > > c = komeda_pipeline_get_component(pipe, id); > > c_st = komeda_component_get_state_and_set_user(c, > > drm_st, NULL, new->crtc); > > - WARN_ON(IS_ERR(c_st)); > > + if (PTR_ERR(c_st) == -EDEADLK) > > + return -EDEADLK; > > } > > + > > + return 0; > > } > > > > /* release unclaimed pipeline resource */ @@ -1266,9 +1269,8 @@ int > > komeda_release_unclaimed_resources(struct komeda_pipeline *pipe, > > if (WARN_ON(IS_ERR_OR_NULL(st))) > > return -EINVAL; > > > > - komeda_pipeline_unbound_components(pipe, st); > > + return komeda_pipeline_unbound_components(pipe, st); > > > > - return 0; > > } > > > > /* Since standalone disabled components must be disabled separately > > and in the > > -- > > 2.17.1 > > > > Thank you for sending this patch, but commit message need a bit more clarity about what you're trying to do and why you think this patch solves it. > > First of all, the commit title talks about dropping locks which is obviously not what the patch is doing here. Second, the patch replaces a WARN_ON() which I would expect to be printed in the log fragment that you have provided, but instead I see the WARNING() from drm_modeset_lock.c:228. I cannot see from the call trace where you've hit a komeda function call, so something is missing. > > Can you also provide us with information on the kernel version you're using? > For me, line 228 is on an empty line inside > drm_warn_on_modeset_not_all_locked() > and drm_modeset_drop_locks() only starts at line 274. > > Best regards, > Liviu > > > -- > ==================== > | I would like to | > | fix the world, | > | but they're not | > | giving me the | > \ source code! / > --------------- > ¯\_(ツ)_/¯ -- ==================== | I would like to | | fix the world, | | but they're not | | giving me the | \ source code! / --------------- ¯\_(ツ)_/¯