[PATCH 1/2] drm/i915: Align the hangcheck wakeup to the nearest second

chris at chris-wilson.co.uk (Chris Wilson) · Fri, 05 Oct 2012 16:51:08 +0100



On Fri, 05 Oct 2012 18:40:05 +0300, Jani Nikula <jani.nikula at linux.intel.com> wrote:
> On Fri, 05 Oct 2012, Chris Wilson <chris at chris-wilson.co.uk> wrote:
> > round_jiffies() aligns the wakeup time to the nearest second in order to
> > batch wakeups and reduce system load, which is useful for unimportant
> > coarse timers like our hangcheck.
> >
> > Suggested-by: Arjan van de Ven <arjan at linux.intel.com>
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Arjan van de Ven <arjan at linux.intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_drv.h |    1 +
> >  drivers/gpu/drm/i915/i915_gem.c |    3 +--
> >  drivers/gpu/drm/i915/i915_irq.c |    5 ++---
> >  3 files changed, 4 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index d8043af..f79c664 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -460,6 +460,7 @@ typedef struct drm_i915_private {
> >  
> >  	/* For hangcheck timer */
> >  #define DRM_I915_HANGCHECK_PERIOD 1500 /* in ms */
> > +#define DRM_I915_HANGCHECK_JIFFIES msecs_to_jiffies(DRM_I915_HANGCHECK_PERIOD)
> >  	struct timer_list hangcheck_timer;
> >  	int hangcheck_count;
> >  	uint32_t last_acthd[I915_NUM_RINGS];
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index c78f8e3..8e05d53 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -2151,8 +2151,7 @@ i915_add_request(struct intel_ring_buffer *ring,
> >  	if (!dev_priv->mm.suspended) {
> >  		if (i915_enable_hangcheck) {
> >  			mod_timer(&dev_priv->hangcheck_timer,
> > -				  jiffies +
> > -				  msecs_to_jiffies(DRM_I915_HANGCHECK_PERIOD));
> > +				  round_jiffies_relative(DRM_I915_HANGCHECK_JIFFIES));
> 
> What is DRM_I915_HANGCHECK_PERIOD based on; specifically is it a strict
> minimum value? Should round_jiffies_*up*_relative() be used instead?

It's an arbitrary value plucked out of the air for being long enough that
any typical batch will have completed and short enough that the user
doesn't turn the machine off in disgust. Typically we repeat the
hangcheck as well to confirm that the GPU is dead before starting the error
recovery. Rounding up is slightly preferable.
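
To make the difference between the two rounding modes concrete, here is a
minimal userspace sketch, not the kernel implementation. It models the
rounding rule for CPU 0 only (no per-CPU skew, no jiffies wraparound) and
assumes HZ = 1000. The names SKETCH_HZ, sketch_round(),
hangcheck_expiry_nearest() and hangcheck_expiry_up() are hypothetical and
exist only for this illustration.

```c
#include <stdbool.h>

/* Assumption: HZ = 1000, i.e. one jiffy per millisecond. */
#define SKETCH_HZ 1000UL
/* Mirrors DRM_I915_HANGCHECK_PERIOD from the patch (in ms). */
#define HANGCHECK_PERIOD_MS 1500UL

/*
 * Simplified model of second-aligned rounding for an absolute time j.
 * Without force_up, times less than a quarter second past a boundary
 * round down, everything else rounds up. With force_up, the result is
 * always the next boundary.
 */
static unsigned long sketch_round(unsigned long j, unsigned long now,
				  bool force_up)
{
	unsigned long rem = j % SKETCH_HZ;
	unsigned long rounded;

	if (rem < SKETCH_HZ / 4 && !force_up)
		rounded = j - rem;		/* round down */
	else
		rounded = j - rem + SKETCH_HZ;	/* round up */

	/* Never hand back an expiry that is not in the future. */
	return rounded > now ? rounded : j;
}

/* Absolute expiry using round-to-nearest behaviour. */
static unsigned long hangcheck_expiry_nearest(unsigned long now)
{
	unsigned long delay = HANGCHECK_PERIOD_MS * SKETCH_HZ / 1000;

	return sketch_round(now + delay, now, false);
}

/* Absolute expiry using round-up behaviour. */
static unsigned long hangcheck_expiry_up(unsigned long now)
{
	unsigned long delay = HANGCHECK_PERIOD_MS * SKETCH_HZ / 1000;

	return sketch_round(now + delay, now, true);
}
```

With now = 10600 the unrounded expiry is 12100. The nearest-second variant
returns 12000, which is 1400 ms away and therefore shorter than the
requested period, while the round-up variant returns 13000. In the driver
the equivalent would presumably be
round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES), since mod_timer()
expects an absolute expiry rather than a relative delay.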
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre