On 07/22, daniel@xxxxxxxx wrote:
> On Wed, Jul 22, 2020 at 08:04:11AM -0300, Melissa Wen wrote:
> > This patch adds a missing drm_crtc_vblank_put op to the pair
> > drm_crtc_vblank_get/put (inc/decrement counter to guarantee vblanks).
> >
> > It clears the execution of the following kms_cursor_crc subtests:
> > 1. pipe-A-cursor-[size,alpha-opaque, NxN-(on-screen, off-screen, sliding,
> >    random, fast-moving])] - successful when running individually.
> > 2. pipe-A-cursor-dpms passes again
> > 3. pipe-A-cursor-suspend also passes
> >
> > The issue was initially tracked in the sequential execution of IGT
> > kms_cursor_crc subtest: when running the test sequence or one of its
> > subtests twice, the odd execs complete and the pairs get stuck in an
> > endless wait. In the IGT code, calling a wait_for_vblank before the start
> > of CRC capture prevented the busy-wait. But the problem persisted in the
> > pipe-A-cursor-dpms and -suspend subtests.
> >
> > Checking the history, the pipe-A-cursor-dpms subtest was successful when,
> > in vkms_atomic_commit_tail, instead of using the flip_done op, it used
> > wait_for_vblanks. Another way to prevent blocking was wait_one_vblank when
> > enabling crtc. However, in both cases, pipe-A-cursor-suspend persisted
> > blocking in the 2nd start of CRC capture, which may indicate that
> > something got stuck in the step of CRC setup. Indeed, wait_one_vblank in
> > the crc setup was able to sync things and free all kms_cursor_crc
> > subtests.
> >
> > Tracing and comparing a clean run with a blocked one:
> > - in a clean one, vkms_crtc_atomic_flush enables vblanks;
> > - when blocked, only in next op, vkms_crtc_atomic_enable, the vblanks
> >   started. Moreover, a series of vkms_vblank_simulate flow out until
> >   disabling vblanks.
> > Also watching the steps of vkms_crtc_atomic_flush, when the very first
> > drm_crtc_vblank_get returned an error, the subtest crashed.
> > On the other hand, when vblank_get succeeded, the subtest completed.
> > Finally, checking the flush steps: it increases counter to hold a vblank
> > reference (get), but there isn't an op to decrease it and release
> > vblanks (put).
> >
> > Cc: Daniel Vetter <daniel@xxxxxxxx>
> > Cc: Rodrigo Siqueira <rodrigosiqueiramelo@xxxxxxxxx>
> > Cc: Haneen Mohammed <hamohammed.sa@xxxxxxxxx>
> > Signed-off-by: Melissa Wen <melissa.srw@xxxxxxxxx>
> > ---
> >  drivers/gpu/drm/vkms/vkms_crtc.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> > index ac85e17428f8..a99d6b4a92dd 100644
> > --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> > +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> > @@ -246,6 +246,7 @@ static void vkms_crtc_atomic_flush(struct drm_crtc *crtc,
> >
> >  		spin_unlock(&crtc->dev->event_lock);
> >
> > +		drm_crtc_vblank_put(crtc);
>
> Uh so I reviewed this a bit more carefully now, and I don't think this is
> the correct bugfix. From the kerneldoc of drm_crtc_arm_vblank_event():
>
>  * Caller must hold a vblank reference for the event @e acquired by a
>  * drm_crtc_vblank_get(), which will be dropped when the next vblank arrives.
>
> So when we call drm_crtc_arm_vblank_event then the vblank_put gets called
> for us. And that's the only case where we successfully acquired a vblank
> interrupt reference since on failure of drm_crtc_vblank_get (0 indicates
> success for that function, failure negative error number) we directly send
> out the event.
>
> So something else fishy is going on, and now I'm totally confused why this
> even happens.
>
> We also have a pile of WARN_ON checks in drm_crtc_vblank_put to make sure
> we don't underflow the refcount, so it's also not that I think (except if
> this patch creates more WARNING backtraces).
>
> But clearly it changes behaviour somehow ... can you try to figure out
> what changes?
> Maybe print out the vblank->refcount at various points in the driver, and
> maybe also trace when exactly the fake vkms vblank hrtimer is
> enabled/disabled ... :(

I can check these, but I also have other suspicions. When I place the
drm_crtc_vblank_put out of the if (at the end of flush), it not only solves
the issue of blocking on kms_cursor_crc, but also the WARN_ON on kms_flip
doesn't appear anymore (a total cleanup). Just after:

	vkms_output->composer_state = to_vkms_crtc_state(crtc->state);

looks like there is something stuck around here.

Besides, there is a lock at atomic_begin:

	/* This lock is held across the atomic commit to block vblank timer
	 * from scheduling vkms_composer_worker until the composer is updated
	 */
	spin_lock_irq(&vkms_output->lock);

that seems to be released on atomic_flush and makes me suspect something
missing on the composer update.

I'll check all these things and come back with news (hope) :)

Thanks,

Melissa

>
> I'm totally confused about what's going on here now.
> -Daniel
>
> >  		crtc->state->event = NULL;
> >  	}
> >
> > --
> > 2.27.0
> >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel
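
For readers following the refcount argument in this thread, here is a minimal userspace sketch of the contract quoted from the drm_crtc_arm_vblank_event() kerneldoc: a successful get takes one reference, arming the event hands that reference over, and it is dropped when the next vblank arrives. Everything named toy_* is hypothetical and only stands in for the kernel functions; none of it is vkms or DRM code. The sketch encodes only the documented pairing. It does not model the hrtimer or the composer lock, so it cannot explain why the extra put changed the test results.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-CRTC vblank state (not the kernel struct). */
struct toy_vblank {
	int refcount;     /* models vblank->refcount */
	bool enabled;     /* whether a vblank reference can be taken at all */
	bool event_armed; /* an event is waiting for the next vblank */
	int underflows;   /* counts what the kernel would flag with WARN_ON */
};

/* Models drm_crtc_vblank_get(): 0 on success, negative error on failure. */
static int toy_vblank_get(struct toy_vblank *v)
{
	if (!v->enabled)
		return -22; /* -EINVAL */
	v->refcount++;
	return 0;
}

/* Models drm_crtc_vblank_put(): refuses to drop below zero and records it. */
static void toy_vblank_put(struct toy_vblank *v)
{
	if (v->refcount == 0) {
		v->underflows++;
		return;
	}
	v->refcount--;
}

/* Models drm_crtc_arm_vblank_event(): the event now owns the caller's reference. */
static void toy_arm_event(struct toy_vblank *v)
{
	v->event_armed = true;
}

/* Models the next vblank: an armed event is sent and its reference dropped. */
static void toy_handle_vblank(struct toy_vblank *v)
{
	if (v->event_armed) {
		v->event_armed = false;
		toy_vblank_put(v);
	}
}

/* Models the event branch of atomic_flush; extra_put mimics the proposed patch. */
static void toy_atomic_flush(struct toy_vblank *v, bool extra_put)
{
	if (toy_vblank_get(v) == 0)
		toy_arm_event(v); /* reference is released on the next vblank */
	/* else: the event is sent immediately and no reference is held */

	if (extra_put)
		toy_vblank_put(v);
}

/* Runs one flush followed by one vblank; returns the number of underflows. */
static int toy_commit_cycle(bool vblank_enabled, bool extra_put)
{
	struct toy_vblank v = { 0, vblank_enabled, false, 0 };

	toy_atomic_flush(&v, extra_put);
	toy_handle_vblank(&v);
	return v.underflows;
}
```

Under these assumptions, toy_commit_cycle() reports no underflow without the extra put and one underflow with it, whether or not the get succeeded. That is the reason the review expects extra WARNING backtraces from the patch, and why printing the real vblank->refcount in the driver is the suggested next step.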