cool, thanks for the detailed explanation. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@xxxxxxxxx> On Fri, Feb 27, 2015 at 6:04 AM, Daniel Vetter <daniel@xxxxxxxx> wrote: > On Thu, Feb 26, 2015 at 05:11:16PM -0800, Rodrigo Vivi wrote: >> I believe this patch is on the wrong series, right? > > It's in here since I've spotted the FIXME while removing ums crap. > >> I'm afraid I don't know what was this race neither the two-step reset >> to be able to review this comment remove. >> Please give me some pointers to check that. > > Let me explain the history a bit. git blame on the various parts and this > fixme should be able to dig out the details (it's a fun story): > > Originally we've had an unconditional drm_irq_install/unistall in the > reset code. Which is not cool since it meant we'd kill all the interrutps > that have been going on, so pageflips, vblank waits, crc checksums, gem > waits all stopped working. This is the bug the FIXME is about. > > With fixed most of these issues by no longer disabling/enabling interrupts > driver-wide, but only restoring the interrupt bits on the gt (they get > lost in the reset). That takes care of all the modeset interrupts. > > The gem waits have been fixed differently and much earlier with the > 2-stage reset code: > > - before reset we set a flag RESET_IN_PROGRESS and wake up all waiters. > > - after reset we clear that flag by incrementing the reset counter and > again wake all waiters > > Waiters always check this flag and the reset counter every time they are > woken and bail out with -EINTR (to restart the entire ioctl) if that's the > case. That means they'll never miss a reset and so won't be affected by > interrupts suddenly being cleared. > > I've simply forgotten to remove the FIXME ;-) > > Cheers, Daniel >> >> >> On Mon, Feb 23, 2015 at 3:03 AM, Daniel Vetter <daniel.vetter@xxxxxxxx> wrote: >> > With the two-step reset counter increments which braket the actual >> > reset code and the subsequent wake-up we're guaranteeing that all the >> > lockless waiters _will_ be woken up. And since we unconditionally bail >> > out of waits with -EAGAIN (or -EIO) in that case there is not risk of >> > lost interrupt enabling bits when the lockless wait code races against >> > a gpu reset. >> > >> > Let's remove this FIXME as resolved then. >> > >> > Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxxx> >> > --- >> > drivers/gpu/drm/i915/i915_drv.c | 6 ------ >> > 1 file changed, 6 deletions(-) >> > >> > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c >> > index cc6c51107047..89741e6e2d08 100644 >> > --- a/drivers/gpu/drm/i915/i915_drv.c >> > +++ b/drivers/gpu/drm/i915/i915_drv.c >> > @@ -878,12 +878,6 @@ int i915_reset(struct drm_device *dev) >> > } >> > >> > /* >> > - * FIXME: This races pretty badly against concurrent holders of >> > - * ring interrupts. This is possible since we've started to drop >> > - * dev->struct_mutex in select places when waiting for the gpu. >> > - */ >> > - >> > - /* >> > * rps/rc6 re-init is necessary to restore state lost after the >> > * reset and the re-install of gt irqs. Skip for ironlake per >> > * previous concerns that it doesn't respond well to some forms >> > -- >> > 2.1.4 >> > >> > _______________________________________________ >> > Intel-gfx mailing list >> > Intel-gfx@xxxxxxxxxxxxxxxxxxxxx >> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx >> >> >> >> -- >> Rodrigo Vivi >> Blog: http://blog.vivi.eng.br > > -- > Daniel Vetter > Software Engineer, Intel Corporation > +41 (0) 79 365 57 48 - http://blog.ffwll.ch -- Rodrigo Vivi Blog: http://blog.vivi.eng.br _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx