I think there are no direct issues with initialization in the state how it is now. I suspect it's because "drm_kms_helper_poll_enable()" starts the first worker thread with a delay, which gives enough time to initialize required resources. I changed the initialization part to keep it consistent with the finish part, which is the one causing troubles.
I am not sure where I could move "drm_kms_helper_poll_enable/disable()", since it is defined in "drm/drm_probe_helper.c", which is only included in "nouveau_display.c" and "nouveau_connector.c". Both creating a new function in "nouveau_display.c", and including "probe_helper.h" and using poll_enable in a different file like "nouveau_fbcon.c" seem like too big changes for such small fix. I don't know.
Can this new proposed order break something in the finish part as well? Maybe it would be just better to change the order of "nouveau_drm_finish" and keep the current order of "noueau_drm_init"?
On Thu, May 5, 2022 at 9:57 PM Lyude Paul <lyude@xxxxxxxxxx> wrote:
Hmm, I think we might just need to move the drm_kms_helper_poll_enable() call
to the end here instead of all of nouveau_display_init(). I realized this
because in nouveau_display_init() it seems that we actually rely on
nouveau_display_init() to setup hotplug interrupts - which we do actually need
this early on in the driver probe process.
That being said though, drm_kms_helper_poll_enable() shouldn't be required for
MST short HPD IRQs from working so moving that instead should work.
On Wed, 2022-05-04 at 19:18 +0200, Mark Menzynski wrote:
> Resources needed for output poll workers are destroyed in
> nouveau_fbcon_fini() before output poll workers are cleared in
> nouveau_display_fini(). This means there is a time between fbcon_fini
> and display_fini, where if output poll happens, it crashes.
>
> BUG: KASAN: use-after-free in
> __drm_fb_helper_initial_config_and_unlock.cold+0x1f3/0x291
> [drm_kms_helper]
>
> Cc: Ben Skeggs <bskeggs@xxxxxxxxxx>
> Cc: Karol Herbst <kherbst@xxxxxxxxxx>
> Cc: Lyude Paul <lyude@xxxxxxxxxx>
> Cc: David Airlie <airlied@xxxxxxxx>
> Cc: Daniel Vetter <daniel@xxxxxxxx>
> Cc: Sumit Semwal <sumit.semwal@xxxxxxxxxx>
> Cc: "Christian König" <christian.koenig@xxxxxxx>
> Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx
> Cc: nouveau@xxxxxxxxxxxxxxxxxxxxx
> Cc: linux-kernel@xxxxxxxxxxxxxxx
> Cc: linux-media@xxxxxxxxxxxxxxx
> Cc: linaro-mm-sig@xxxxxxxxxxxxxxxx
> Signed-off-by: Mark Menzynski <mmenzyns@xxxxxxxxxx>
> ---
> drivers/gpu/drm/nouveau/nouveau_drm.c | 17 ++++++++---------
> 1 file changed, 8 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c
> b/drivers/gpu/drm/nouveau/nouveau_drm.c
> index 561309d447e0..773efdd20d2f 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_drm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
> @@ -588,12 +588,6 @@ nouveau_drm_device_init(struct drm_device *dev)
> if (ret)
> goto fail_dispctor;
>
> - if (dev->mode_config.num_crtc) {
> - ret = nouveau_display_init(dev, false, false);
> - if (ret)
> - goto fail_dispinit;
> - }
> -
> nouveau_debugfs_init(drm);
> nouveau_hwmon_init(dev);
> nouveau_svm_init(drm);
> @@ -601,6 +595,12 @@ nouveau_drm_device_init(struct drm_device *dev)
> nouveau_fbcon_init(dev);
> nouveau_led_init(dev);
>
> + if (dev->mode_config.num_crtc) {
> + ret = nouveau_display_init(dev, false, false);
> + if (ret)
> + goto fail_dispinit;
> + }
> +
> if (nouveau_pmops_runtime()) {
> pm_runtime_use_autosuspend(dev->dev);
> pm_runtime_set_autosuspend_delay(dev->dev, 5000);
> @@ -641,15 +641,14 @@ nouveau_drm_device_fini(struct drm_device *dev)
> pm_runtime_forbid(dev->dev);
> }
>
> + if (dev->mode_config.num_crtc)
> + nouveau_display_fini(dev, false, false);
> nouveau_led_fini(dev);
> nouveau_fbcon_fini(dev);
> nouveau_dmem_fini(drm);
> nouveau_svm_fini(drm);
> nouveau_hwmon_fini(dev);
> nouveau_debugfs_fini(drm);
> -
> - if (dev->mode_config.num_crtc)
> - nouveau_display_fini(dev, false, false);
> nouveau_display_destroy(dev);
>
> nouveau_accel_fini(drm);
--
Cheers,
Lyude Paul (she/her)
Software Engineer at Red Hat