On Thu, 1 Nov 2012 08:34:47 -0700, Jesse Barnes <jbarnes at virtuousgeek.org> wrote: > On Thu, 1 Nov 2012 16:29:35 +0100 > Daniel Vetter <daniel at ffwll.ch> wrote: > > > On Thu, Nov 01, 2012 at 03:18:46PM +0000, Chris Wilson wrote: > > > On Thu, 1 Nov 2012 08:07:59 -0700, Jesse Barnes <jbarnes at virtuousgeek.org> wrote: > > > > On Thu, 1 Nov 2012 09:26:26 +0000 > > > > Chris Wilson <chris at chris-wilson.co.uk> wrote: > > > > > > > > > If we accumulate unpin tasks because we are pageflipping faster than the > > > > > system can schedule its workers, we can effectively create a > > > > > pin-leak. The solution taken here is to limit the number of unpin tasks > > > > > we have per-crtc and to flush those outstanding tasks if we accumulate > > > > > too many. This should prevent any jitter in the normal case, and also > > > > > prevent the hang if we should run too fast. > > > > > > > > > > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=46991 > > > > > Reported-and-tested-by: Tvrtko Ursulin <tvrtko.ursulin at onelan.co.uk> > > > > > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk> > > > > > --- > > > > > drivers/gpu/drm/i915/intel_display.c | 22 ++++++++++++++++------ > > > > > drivers/gpu/drm/i915/intel_drv.h | 4 +++- > > > > > 2 files changed, 19 insertions(+), 7 deletions(-) > > > > > > > > > > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c > > > > > index 69b1739..800b195 100644 > > > > > --- a/drivers/gpu/drm/i915/intel_display.c > > > > > +++ b/drivers/gpu/drm/i915/intel_display.c > > > > > @@ -6908,14 +6908,19 @@ static void intel_unpin_work_fn(struct work_struct *__work) > > > > > { > > > > > struct intel_unpin_work *work = > > > > > container_of(__work, struct intel_unpin_work, work); > > > > > + struct drm_device *dev = work->crtc->dev; > > > > > > > > > > - mutex_lock(&work->dev->struct_mutex); > > > > > + mutex_lock(&dev->struct_mutex); > > > > > intel_unpin_fb_obj(work->old_fb_obj); > > > > > drm_gem_object_unreference(&work->pending_flip_obj->base); > > > > > drm_gem_object_unreference(&work->old_fb_obj->base); > > > > > > > > > > - intel_update_fbc(work->dev); > > > > > - mutex_unlock(&work->dev->struct_mutex); > > > > > + intel_update_fbc(dev); > > > > > + mutex_unlock(&dev->struct_mutex); > > > > > + > > > > > + BUG_ON(atomic_read(&to_intel_crtc(work->crtc)->unpin_work_count) == 0); > > > > > + atomic_dec(&to_intel_crtc(work->crtc)->unpin_work_count); > > > > > + > > > > > kfree(work); > > > > > } > > > > > > > > > > @@ -6963,9 +6968,9 @@ static void do_intel_finish_page_flip(struct drm_device *dev, > > > > > > > > > > atomic_clear_mask(1 << intel_crtc->plane, > > > > > &obj->pending_flip.counter); > > > > > - > > > > > wake_up(&dev_priv->pending_flip_queue); > > > > > - schedule_work(&work->work); > > > > > + > > > > > + queue_work(dev_priv->wq, &work->work); > > > > > > > > > > trace_i915_flip_complete(intel_crtc->plane, work->pending_flip_obj); > > > > > } > > > > > @@ -7266,7 +7271,7 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc, > > > > > return -ENOMEM; > > > > > > > > > > work->event = event; > > > > > - work->dev = crtc->dev; > > > > > + work->crtc = crtc; > > > > > intel_fb = to_intel_framebuffer(crtc->fb); > > > > > work->old_fb_obj = intel_fb->obj; > > > > > INIT_WORK(&work->work, intel_unpin_work_fn); > > > > > @@ -7291,6 +7296,9 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc, > > > > > intel_fb = to_intel_framebuffer(fb); > > > > > obj = intel_fb->obj; > > > > > > > > > > + if (atomic_read(&intel_crtc->unpin_work_count) >= 2) > > > > > + flush_workqueue(dev_priv->wq); > > > > > + > > > > > > > > Have you by chance tested this with the async flip patch? I wonder if > > > > in that case whether 2 is too small, and something like 100 might be > > > > better (though really async flips are for cases where we can't keep up > > > > with refresh, so a small number shouldn't hurt too much there either). > > > > > > The limit on 2 is due to the limited resolution of pincount. Hence my > > > earlier fear for your async flip patch. > > > > I think for asyn flips we simply need to have a real flip queue in our > > code, instead of abusing the implicit list in the workqueue code ... > > > > One other thing is that with async flips we don't have a natural limit on > > the number of pinned framebuffers any more, which means we can easily > > exhaust all mappable GTT space. Hence we need to integrate that new, > > explicit flip queue into our eviction code, too. > > > > For now I'm rather happy with the flush_wq ducttape presented here ;-) > > Yeah I don't have a problem with it as long as we don't block when > queuing flips in real life. :) Actually I've justified the blocking here to myself, and prefer it to simply running the crtc->unpin_work. If userspace is swamping the system so badly that we can run the kthreads quick enough, it deserves a stall. Note that the unpin leak is still about the 3rd most common bug in fedora, so this stall will be forced on many machines. -Chris -- Chris Wilson, Intel Open Source Technology Centre