Hello Maarten,

> -----Original Message-----
> From: Intel-gfx <intel-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of
> Maarten Lankhorst
> Sent: Wednesday, January 31, 2024 5:40 PM
> To: intel-xe@xxxxxxxxxxxxxxxxxxxxx; intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> Subject: [CI 2/3] drm/i915: Use vblank worker to unpin old legacy cursor fb
> safely
>
> From: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx>
>
> The cursor hardware only does sync updates, and thus the hardware will be
> scanning out from the old fb until the next start of vblank.
> So in order to make the legacy cursor fastpath actually safe we should not
> unpin the old fb until we're sure the hardware has ceased accessing it. The
> simplest approach is to just use a vblank work here to do the delayed unpin.
>
> Not 100% sure it's a good idea to put this onto the same high priority vblank
> worker as eg. our timing critical gamma updates.
> But let's keep it simple for now, and it we later discover that this is causing
> problems we can think about adding a lower priority worker for such things.
>
> This patch is slightly reworked by Maarten
>

We have been looking into the CI regressions seen with this patch series.
As far as we understand, there are multiple code paths which manipulate the
plane state. They can be called asynchronously at any time, which leads to
the random stack dumps and NULL pointer dereferences. [1] and [2] are some
examples.

We have identified the following code paths:
1. vblank fb unpin worker
2. intel_atomic_cleanup_work
3. legacy cursor ioctl
4. atomic commit ioctls

Another interesting finding was that, at least on ADL-P, the vblank unpin
worker was never scheduled from the legacy cursor update call. In other
words, the following condition was never true within
intel_legacy_cursor_update:

	if (old_plane_state->ggtt_vma != new_plane_state->ggtt_vma)

The vblank worker was exclusively scheduled from intel_pipe_update_end.

We tried a few things to work around these issues:
1. Add a check in the plane state destroy hook so that it does not proceed
   if the vblank worker is scheduled.
2. Add checks before accessing the framebuffer object (we are not sure yet
   how much this helps, but we have found that this operation causes stack
   dumps).
3. Do not defer the intel atomic cleanup to a work queue; instead execute it
   at the end of the atomic commit tail.

With these changes we were able to get good results in trybot [3]. There
were some "Potential atomic update failure" issues, but in a discussion with
Ville we concluded that these can be ignored.

I will float an RFC series with these changes to trigger a discussion.

Regards
Chaitanya

[1]
<4> [436.749743] Call Trace:
<4> [436.749746] <TASK>
<4> [436.749748] ? __die_body+0x1a/0x60
<4> [436.749756] ? die_addr+0x38/0x60
<4> [436.749759] ? exc_general_protection+0x1a2/0x400
<4> [436.749767] ? asm_exc_general_protection+0x26/0x30
<4> [436.749773] ? process_scheduled_works+0x264/0x530
<4> [436.749778] ? __pfx_intel_cleanup_plane_fb+0x10/0x10 [i915]
<4> [436.750085] ? intel_cleanup_plane_fb+0x10/0x90 [i915]
<4> [436.750391] drm_atomic_helper_cleanup_planes+0x42/0x60
<4> [436.750398] intel_atomic_cleanup_work+0x66/0xb0 [i915]
<4> [436.750704] ? process_scheduled_works+0x264/0x530
<4> [436.750709] process_scheduled_works+0x2db/0x530
<4> [436.750715] ? __pfx_worker_thread+0x10/0x10
<4> [436.750718] worker_thread+0x18c/0x350
<4> [436.750722] ? __pfx_worker_thread+0x10/0x10
<4> [436.750725] kthread+0xfe/0x130
<4> [436.750730] ? __pfx_kthread+0x10/0x10
<4> [436.750735] ret_from_fork+0x2c/0x50
<4> [436.750739] ? __pfx_kthread+0x10/0x10
<4> [436.750743] ret_from_fork_asm+0x1b/0x30
<4> [436.750750] </TASK>

[2]
[ 613.496510] Call Trace:
[ 613.496514] <TASK>
[ 613.496516] ? __die_body+0x1a/0x60
[ 613.496525] ? page_fault_oops+0x156/0x450
[ 613.496530] ? do_user_addr_fault+0x65/0x9e0
[ 613.496536] ? exc_page_fault+0x68/0x1a0
[ 613.496543] ? asm_exc_page_fault+0x26/0x30
[ 613.496551] ? intel_display_rps_mark_interactive+0x4/0x40 [i915]
[ 613.496860] intel_cleanup_plane_fb+0x5d/0xc0 [i915]
[ 613.497173] drm_atomic_helper_cleanup_planes+0x42/0x60
[ 613.497181] intel_atomic_cleanup_work+0x70/0xc0 [i915]
[ 613.497493] ? process_scheduled_works+0x264/0x530
[ 613.497498] process_scheduled_works+0x2db/0x530
[ 613.497504] ? __pfx_worker_thread+0x10/0x10
[ 613.497507] worker_thread+0x18c/0x350
[ 613.497511] ? __pfx_worker_thread+0x10/0x10
[ 613.497514] kthread+0xfe/0x130
[ 613.497520] ? __pfx_kthread+0x10/0x10
[ 613.497524] ret_from_fork+0x2c/0x50
[ 613.497528] ? __pfx_kthread+0x10/0x10
[ 613.497532] ret_from_fork_asm+0x1b/0x30
[ 613.497540] </TASK>

[3] https://patchwork.freedesktop.org/series/128430/#rev7

> Cc: Maarten Lankhorst <maarten.lankhorst@xxxxxxxxxxxxxxx>
> Signed-off-by: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx>
> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@xxxxxxxxxxxxxxx>
> ---
>  drivers/gpu/drm/i915/display/intel_cursor.c   | 26 +++++++++++++++++--
>  drivers/gpu/drm/i915/display/intel_display.c  |  3 +++
>  .../drm/i915/display/intel_display_types.h    |  3 +++
>  3 files changed, 30 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/display/intel_cursor.c b/drivers/gpu/drm/i915/display/intel_cursor.c
> index 926e2de00eb58..64bdf0eb7943c 100644
> --- a/drivers/gpu/drm/i915/display/intel_cursor.c
> +++ b/drivers/gpu/drm/i915/display/intel_cursor.c
> @@ -635,6 +635,17 @@ static bool intel_cursor_format_mod_supported(struct drm_plane *_plane,
>  	return format == DRM_FORMAT_ARGB8888;
>  }
>
> +static void intel_cursor_unpin_work(struct kthread_work *base)
> +{
> +	struct drm_vblank_work *work = to_drm_vblank_work(base);
> +	struct intel_plane_state *plane_state =
> +		container_of(work, typeof(*plane_state), unpin_work);
> +	struct intel_plane *plane = to_intel_plane(plane_state->uapi.plane);
> +
> +	intel_plane_unpin_fb(plane_state);
> +	intel_plane_destroy_state(&plane->base, &plane_state->uapi);
> +}
> +
>  static int
>  intel_legacy_cursor_update(struct drm_plane *_plane,
>  			   struct drm_crtc *_crtc,
> @@ -762,14 +773,25 @@ intel_legacy_cursor_update(struct drm_plane *_plane,
>
>  	local_irq_enable();
>
> -	intel_plane_unpin_fb(old_plane_state);
> +	if (old_plane_state->ggtt_vma != new_plane_state->ggtt_vma) {
> +		drm_vblank_work_init(&old_plane_state->unpin_work, &crtc->base,
> +				     intel_cursor_unpin_work);
> +
> +		drm_vblank_work_schedule(&old_plane_state->unpin_work,
> +					 drm_crtc_accurate_vblank_count(&crtc->base) + 1,
> +					 false);
> +
> +		old_plane_state = NULL;
> +	} else {
> +		intel_plane_unpin_fb(old_plane_state);
> +	}
>
>  out_free:
>  	if (new_crtc_state)
>  		intel_crtc_destroy_state(&crtc->base, &new_crtc_state->uapi);
>  	if (ret)
>  		intel_plane_destroy_state(&plane->base, &new_plane_state->uapi);
> -	else
> +	else if (old_plane_state)
>  		intel_plane_destroy_state(&plane->base, &old_plane_state->uapi);
>  	return ret;
>
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> index b10aad15a63d9..b3d73ded097c4 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -64,6 +64,7 @@
>  #include "intel_crt.h"
>  #include "intel_crtc.h"
>  #include "intel_crtc_state_dump.h"
> +#include "intel_cursor.h"
>  #include "intel_ddi.h"
>  #include "intel_de.h"
>  #include "intel_display_driver.h"
> @@ -6771,6 +6772,8 @@ static void intel_commit_modeset_disables(struct intel_atomic_state *state)
>  			continue;
>
>  		intel_crtc_disable_planes(state, crtc);
> +
> +		drm_vblank_work_flush_all(&crtc->base);
>  	}
>
>  	/* Only disable port sync and MST slaves */
> diff --git a/drivers/gpu/drm/i915/display/intel_display_types.h b/drivers/gpu/drm/i915/display/intel_display_types.h
> index 3fdd8a5179831..b7faeeb5ddc1c 100644
> --- a/drivers/gpu/drm/i915/display/intel_display_types.h
> +++ b/drivers/gpu/drm/i915/display/intel_display_types.h
> @@ -713,6 +713,9 @@ struct intel_plane_state {
>
>  	struct intel_fb_view view;
>
> +	/* for legacy cursor fb unpin */
> +	struct drm_vblank_work unpin_work;
> +
>  	/* Plane pxp decryption state */
>  	bool decrypt;
>
> --
> 2.43.0
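
For illustration only, the first workaround mentioned in the reply (the plane state destroy hook does not proceed while the vblank unpin worker is scheduled) could be modelled in plain user-space C roughly as below. This is a sketch under assumptions, not the i915 code: `struct model_plane_state`, the `unpin_pending` flag and all `model_*` helpers are hypothetical names invented for the example, and the model is single-threaded with no locking, so it only shows the ownership rule, not the synchronization a real driver would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for intel_plane_state; NOT the i915 structure. */
struct model_plane_state {
	bool fb_pinned;     /* models a still-pinned framebuffer */
	bool unpin_pending; /* hypothetical flag: deferred unpin is scheduled */
};

/* Allocate a state whose framebuffer is pinned and has no pending work. */
static struct model_plane_state *model_alloc_state(void)
{
	struct model_plane_state *state = calloc(1, sizeof(*state));

	if (state)
		state->fb_pinned = true;
	return state;
}

/*
 * Models the destroy hook. It refuses to free the state while the
 * deferred unpin is pending and returns true only if the state was freed.
 */
static bool model_destroy_state(struct model_plane_state *state)
{
	if (state->unpin_pending)
		return false; /* the deferred unpin worker still owns the state */
	free(state);
	return true;
}

/* Models scheduling the deferred unpin from the legacy cursor path. */
static void model_schedule_unpin(struct model_plane_state *state)
{
	state->unpin_pending = true;
}

/*
 * Models the worker body run at the next vblank: unpin the framebuffer,
 * clear the pending flag, then destroy the state it owns.
 */
static bool model_unpin_work(struct model_plane_state *state)
{
	assert(state->unpin_pending);
	state->fb_pinned = false;
	state->unpin_pending = false;
	return model_destroy_state(state);
}
```

In this model a destroy attempt made by any other path between `model_schedule_unpin()` and `model_unpin_work()` is a no-op, so the worker never touches freed memory. Whether a simple flag is sufficient in the driver depends on how the four code paths listed in the reply are serialized against each other, which this sketch does not address.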