Hi Danilo, Appologies again for the delay in reviewing this as I was at XDC last week. This patch is causing a regression at 'rmmod' time as the drm_crtc_vblank_off() does not get called when we disable outputs and the HDLCD remains active as I keep getting unhandled context faults from the arm-smmu driver that sits in front of the HDLCD. This is the stack trace for it: root@alarm ~]# rmmod hdlcd [ 198.981343] Console: switching to colour dummy device 80x25 [ 199.012492] ------------[ cut here ]------------ [ 199.017209] driver forgot to call drm_crtc_vblank_off() [ 199.023513] WARNING: CPU: 5 PID: 176 at drivers/gpu/drm/drm_atomic_helper.c:1236 disable_outputs+0x2c4/0x2d0 [drm_kms_helper] [ 199.035055] Modules linked in: hdlcd(-) drm_dma_helper tda998x cec drm_kms_helper drm [ 199.042929] CPU: 5 PID: 176 Comm: kworker/5:2 Not tainted 6.0.0-rc2-00007-ge17e6f0211cd #6 [ 199.051212] Hardware name: ARM Juno development board (r2) (DT) [ 199.057143] Workqueue: events drm_mode_rmfb_work_fn [drm] [ 199.062831] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 199.069809] pc : disable_outputs+0x2c4/0x2d0 [drm_kms_helper] [ 199.075664] lr : disable_outputs+0x2c4/0x2d0 [drm_kms_helper] [ 199.081519] sp : ffff80000a8f3b60 [ 199.084836] x29: ffff80000a8f3b60 x28: 0000000000000028 x27: ffff0008013962b8 [ 199.091996] x26: ffff800001049688 x25: ffff800001080520 x24: 0000000000000038 [ 199.099155] x23: 0000000000000000 x22: ffff000801396000 x21: ffff0008013966f0 [ 199.106314] x20: 0000000000000000 x19: ffff00080a65a800 x18: 0000000000000000 [ 199.113472] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 [ 199.120630] x14: 00000000000000e1 x13: 0000000000000000 x12: 0000000000000000 [ 199.127788] x11: 0000000000000001 x10: 0000000000000a60 x9 : ffff80000a8f3910 [ 199.134947] x8 : ffff00080149d480 x7 : ffff00097efb5bc0 x6 : 0000000000000083 [ 199.142106] x5 : 0000000000000000 x4 : ffff00097efaea18 x3 : ffff00097efb1c20 [ 199.149264] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff00080149c9c0 [ 199.156423] Call trace: [ 199.158870] disable_outputs+0x2c4/0x2d0 [drm_kms_helper] [ 199.164380] drm_atomic_helper_commit_tail+0x24/0xa0 [drm_kms_helper] [ 199.170933] commit_tail+0x150/0x180 [drm_kms_helper] [ 199.176093] drm_atomic_helper_commit+0x13c/0x370 [drm_kms_helper] [ 199.182384] drm_atomic_commit+0xa4/0xe0 [drm] [ 199.187074] drm_framebuffer_remove+0x434/0x4d4 [drm] [ 199.192374] drm_mode_rmfb_work_fn+0x74/0x9c [drm] [ 199.197413] process_one_work+0x1d4/0x330 [ 199.201437] worker_thread+0x220/0x430 [ 199.205195] kthread+0x108/0x10c [ 199.208430] ret_from_fork+0x10/0x20 [ 199.212015] ---[ end trace 0000000000000000 ]--- [ 199.221088] arm-smmu 7fb10000.iommu: Unhandled context fault: fsr=0x2, iova=0xffa53600, fsynr=0x182, cbfrsynra=0x0, cb=0 [ 199.232958] arm-smmu 7fb10000.iommu: Unhandled context fault: fsr=0x2, iova=0xff800000, fsynr=0x182, cbfrsynra=0x0, cb=0 [ 199.233228] ------------[ cut here ]------------ [ 199.248618] hdlcd 7ff50000.hdlcd: drm_WARN_ON(({ do { __attribute__((__noreturn__)) extern void __compiletime_assert_384(void) __attribute__((__error__("Unsupported access size for {READ,WRITE}_ONCE()."))); if (!((sizeof(vblank->enabled) == sizeof(char) || sizeof(vblank->enabled) == sizeof(short) || sizeof(vblank->🢱 enabled) == sizeof(int) || sizeof(vblank->enabled) == sizeof(long)) || sizeof(vblank->enabled) == sizeof(long long))) __compiletime_assert_384(); } while (0); (*(const volatile typeof( _Generic((vblank->enabled), char: (char)0, unsigned char: (unsigned char)0, signed char: (signed char)0, unsigned short: (unsigned sho🢱 rt)0, signed short: (signed short)0, unsigned int: (unsigned int)0, signed int: (signed int)0, unsigned long: (unsigned long)0, signed long: (signed long)0, unsigned long long: (unsigned long long)0, signed long long: (signed long long)0, default: (vblank->enabled))) *)&(vblank->enabled)); }) && drm_core_check_feature🢱 (dev, DRIVER_MODESET)) [ 199.248790] WARNING: CPU: 4 PID: 285 at drivers/gpu/drm/drm_vblank.c:504 drm_vblank_init_release+0x84/0xb0 [drm] [ 199.249751] arm-smmu 7fb10000.iommu: Unhandled context fault: fsr=0x2, iova=0xff800000, fsynr=0x182, cbfrsynra=0x0, cb=0 [ 199.291902] arm-scmi firmware:scmi: timed out in resp(caller: scmi_perf_level_set+0xfc/0x120) [ 199.291926] cpufreq: __target_index: Failed to change cpu frequency: -110 [ 199.334654] Modules linked in: hdlcd(-) [ 199.345063] scmi-cpufreq scmi_dev.2: Message for 857 type 0 is not expected! [ 199.355734] drm_dma_helper [ 199.371060] tda998x [ 199.384738] cec [ 199.408686] drm_kms_helper [ 199.421404] drm [ 199.436732] [ 199.450409] CPU: 4 PID: 285 Comm: rmmod Tainted: G W 6.0.0-rc2-00007-ge17e6f0211cd #6 [ 199.475499] Hardware name: ARM Juno development board (r2) (DT) [ 199.501541] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 199.508531] pc : drm_vblank_init_release+0x84/0xb0 [drm] [ 199.514314] lr : drm_vblank_init_release+0x84/0xb0 [drm] [ 199.519877] sp : ffff80000b183b30 [ 199.523195] x29: ffff80000b183b30 x28: ffff0008008b8000 x27: 0000000000000000 [ 199.530374] x26: 0000000000000000 x25: dead000000000100 x24: dead000000000122 [ 199.537550] x23: ffff000801396010 x22: ffff800001008010 x21: ffff000801396000 [ 199.544727] x20: ffff000801396000 x19: ffff000800398480 x18: fffffffffffeb478 [ 199.551904] x17: 000000040044ffff x16: 00400032b5503510 x15: 00000000000003d0 [ 199.559080] x14: 0000000000000001 x13: ffff800009f62f50 x12: 000000000000073e [ 199.566256] x11: 000000000000026a x10: ffff800009fbe940 x9 : ffff800009f62f50 [ 199.573432] x8 : 00000000ffffefff x7 : ffff800009fbaf50 x6 : 000000000000026a [ 199.580607] x5 : ffff00097ef9aa18 x4 : 0000000000000000 x3 : 0000000000000027 [ 199.587782] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0008008b8000 [ 199.594957] Call trace: [ 199.597409] drm_vblank_init_release+0x84/0xb0 [drm] [ 199.602823] drm_managed_release+0xa8/0x140 [drm] [ 199.607973] drm_dev_put.part.0+0x78/0xb0 [drm] [ 199.612946] devm_drm_dev_init_release+0x18/0x30 [drm] [ 199.618529] devm_action_release+0x14/0x20 [ 199.622654] devres_release_group+0xe0/0x164 [ 199.626949] component_master_del+0xb0/0xc0 [ 199.631154] hdlcd_remove+0x1c/0x2c [hdlcd] [ 199.635368] platform_remove+0x28/0x60 [ 199.639138] device_remove+0x4c/0x80 [ 199.642731] device_release_driver_internal+0x1e4/0x250 [ 199.647979] driver_detach+0x50/0xe0 [ 199.651572] bus_remove_driver+0x5c/0xbc [ 199.655513] driver_unregister+0x30/0x60 [ 199.659454] platform_driver_unregister+0x14/0x20 [ 199.664181] hdlcd_platform_driver_exit+0x1c/0xe20 [hdlcd] [ 199.669698] __arm64_sys_delete_module+0x18c/0x240 [ 199.674513] invoke_syscall+0x48/0x114 [ 199.678286] el0_svc_common.constprop.0+0xcc/0xec [ 199.683015] do_el0_svc+0x2c/0xc0 [ 199.686350] el0_svc+0x2c/0x84 [ 199.689424] el0t_64_sync_handler+0x11c/0x150 [ 199.693803] el0t_64_sync+0x18c/0x190 [ 199.697485] ---[ end trace 0000000000000000 ]--- Looking at the documentation for drm_dev_unplug, you can get a hint about what is going on: /* * [....] There is one * shortcoming however, drm_dev_unplug() marks the drm_device as unplugged before * drm_atomic_helper_shutdown() is called. This means that if the disable code * paths are protected, they will not run on regular driver module unload, * possibly leaving the hardware enabled. */ I've reverted this patch and I can remove most of the time, but I also get this crash from time to time: [root@alarm ~]# rmmod hdlcd [ 5442.625918] Console: switching to colour dummy device 80x25 [ 5442.662398] Unable to handle kernel paging request at virtual address ffff80000aa8c230 [ 5442.672018] Mem abort info: [ 5442.675537] ESR = 0x0000000096000047 [ 5442.679654] EC = 0x25: DABT (current EL), IL = 32 bits [ 5442.685355] SET = 0, FnV = 0 [ 5442.688759] EA = 0, S1PTW = 0 [ 5442.692122] FSC = 0x07: level 3 translation fault [ 5442.697031] Data abort info: [ 5442.699912] ISV = 0, ISS = 0x00000047 [ 5442.703766] CM = 0, WnR = 1 [ 5442.706749] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000081a7f000 [ 5442.713493] [ffff80000aa8c230] pgd=10000009fffff003, p4d=10000009fffff003, pud=10000009ffffe003, pmd=10000008811dd003, pte=0000000000000000 [ 5442.726214] Internal error: Oops: 96000047 [#1] PREEMPT SMP [ 5442.731803] Modules linked in: psmouse libps2 onboard_usb_hub tda998x crct10dif_ce cec polyval_ce ambakmi polyval_generic serio hdlcd(-) drm_dma_helper arm_smccc_trng gpio_keys rng_core drm_kms_helper cfg80211 rfkill fuse drm ip_tables x_tables ipv6 [ 5442.753979] CPU: 0 PID: 274 Comm: rmmod Not tainted 6.0.0-rc2-00007-ge17e6f0211cd-dirty #7 [ 5442.760857] arm-scmi firmware:scmi: timed out in resp(caller: scmi_perf_level_set+0xfc/0x120) [ 5442.762258] Hardware name: ARM Juno development board (r2) (DT) [ 5442.762263] pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 5442.770822] cpufreq: __target_index: Failed to change cpu frequency: -110 [ 5442.776718] pc : hdlcd_crtc_cleanup+0x44/0x84 [hdlcd] [ 5442.776735] lr : hdlcd_crtc_cleanup+0x30/0x84 [hdlcd] [ 5442.776744] sp : ffff80000a963a60 [ 5442.803918] x29: ffff80000a963a60 x28: ffff00080240e740 x27: 0000000000000000 [ 5442.811071] x26: 0000000000000000 x25: dead000000000100 x24: dead000000000122 [ 5442.816720] arm-scmi firmware:scmi: timed out in resp(caller: scmi_perf_level_set+0xfc/0x120) [ 5442.818222] x23: ffff000800d1f010 x22: ffff000800d1f2d8 x21: ffff000800d1f200 [ 5442.826803] cpufreq: __target_index: Failed to change cpu frequency: -110 [ 5442.833905] x20: ffff000800d1f000 x19: ffff000800d1f6f0 x18: ffffffffffffffff [ 5442.833916] x17: 30647261632f6d72 x16: 642f64636c64682e x15: 0000000000000001 [ 5442.833926] x14: 0000000000000004 x13: ffff000800d1f1e8 x12: 0000000000000000 [ 5442.862146] x11: ffff0008005aec68 x10: ffff0008005aeb58 x9 : 0000000000000000 [ 5442.869296] x8 : ffff0008005aeb80 x7 : 0000000000000000 x6 : 0000000000000228 [ 5442.876446] x5 : 000000000000082a x4 : 0000000000000000 x3 : 0000000000000001 [ 5442.883595] x2 : ffff00080240e740 x1 : 0000000000000000 x0 : ffff80000aa8c230 [ 5442.890746] Call trace: [ 5442.893189] hdlcd_crtc_cleanup+0x44/0x84 [hdlcd] [ 5442.897902] drm_mode_config_cleanup+0x150/0x2fc [drm] [ 5442.903160] drm_mode_config_init_release+0x10/0x20 [drm] [ 5442.908671] drm_managed_release+0xa8/0x140 [drm] [ 5442.913489] drm_dev_put.part.0+0x78/0xb0 [drm] [ 5442.918131] devm_drm_dev_init_release+0x18/0x30 [drm] [ 5442.923380] devm_action_release+0x14/0x20 [ 5442.927487] devres_release_group+0xe0/0x164 [ 5442.931763] component_master_del+0xb0/0xc0 [ 5442.935951] hdlcd_remove+0x1c/0x2c [hdlcd] [ 5442.940142] platform_remove+0x28/0x60 [ 5442.943894] device_remove+0x4c/0x80 [ 5442.947471] device_release_driver_internal+0x1e4/0x250 [ 5442.952703] driver_detach+0x50/0xe0 [ 5442.956280] bus_remove_driver+0x5c/0xbc [ 5442.960205] driver_unregister+0x30/0x60 [ 5442.964131] platform_driver_unregister+0x14/0x20 [ 5442.968840] hdlcd_platform_driver_exit+0x1c/0xe20 [hdlcd] [ 5442.974335] __arm64_sys_delete_module+0x18c/0x240 [ 5442.979133] invoke_syscall+0x48/0x114 [ 5442.982887] el0_svc_common.constprop.0+0xcc/0xec [ 5442.987597] do_el0_svc+0x2c/0xc0 [ 5442.990915] el0_svc+0x2c/0x84 [ 5442.993972] el0t_64_sync_handler+0x11c/0x150 [ 5442.998334] el0t_64_sync+0x18c/0x190 [ 5443.002001] Code: 540000e0 f85f0260 9108c000 d50332bf (b900001f) [ 5443.008103] ---[ end trace 0000000000000000 ]--- [ 5443.012793] arm-scmi firmware:scmi: timed out in resp(caller: scmi_perf_level_set+0xfc/0x120) [ 5443.021366] cpufreq: __target_index: Failed to change cpu frequency: -110 I'm finally getting to a conclusion: I'm still not sure what problem you were trying to solve when you have started this series and if you have found a scenario where you've got use after free then I would like to be able to reproduce it on my setup. Otherwise, I think this whole series introduces a regression on the behaviour of the driver and I would be inclined to reject it. Best regards, Liviu On Sat, Oct 01, 2022 at 03:19:02AM +0200, Danilo Krummrich wrote: > When the driver is unbound, there might still be users in userspace > having an open fd and are calling into the driver. > > While this is fine for drm managed resources, it is not for resources > bound to the device/driver lifecycle, e.g. clocks or MMIO mappings. > > To prevent use-after-free issues we need to protect those resources with > drm_dev_enter() and drm_dev_exit(). This does only work if we indicate > that the drm device was unplugged, hence use drm_dev_unplug() instead of > drm_dev_unregister(). > > Protecting the particular resources with drm_dev_enter()/drm_dev_exit() > is handled by subsequent patches. > > Signed-off-by: Danilo Krummrich <dakr@xxxxxxxxxx> > --- > drivers/gpu/drm/arm/hdlcd_drv.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/arm/hdlcd_drv.c b/drivers/gpu/drm/arm/hdlcd_drv.c > index 120c87934a91..e41def6d47cc 100644 > --- a/drivers/gpu/drm/arm/hdlcd_drv.c > +++ b/drivers/gpu/drm/arm/hdlcd_drv.c > @@ -325,7 +325,7 @@ static void hdlcd_drm_unbind(struct device *dev) > struct drm_device *drm = dev_get_drvdata(dev); > struct hdlcd_drm_private *hdlcd = drm_to_hdlcd_priv(drm); > > - drm_dev_unregister(drm); > + drm_dev_unplug(drm); > drm_kms_helper_poll_fini(drm); > component_unbind_all(dev, drm); > of_node_put(hdlcd->crtc.port); > -- > 2.37.3 > -- ==================== | I would like to | | fix the world, | | but they're not | | giving me the | \ source code! / --------------- ¯\_(ツ)_/¯