On Tue, Jun 07, 2016 at 02:31:07PM +0300, Joonas Lahtinen wrote:
> On pe, 2016-06-03 at 17:36 +0100, Chris Wilson wrote:
> >  i915_gem_idle_work_handler(struct work_struct *work)
> >  {
> >  	struct drm_i915_private *dev_priv =
> > -		container_of(work, typeof(*dev_priv), mm.idle_work.work);
> > +		container_of(work, typeof(*dev_priv), gt.idle_work.work);
> >  	struct drm_device *dev = dev_priv->dev;
> >  	struct intel_engine_cs *engine;
> >  
> > -	for_each_engine(engine, dev_priv)
> > -		if (!list_empty(&engine->request_list))
> > -			return;
> > +	if (!READ_ONCE(dev_priv->gt.awake))
> > +		return;
> >  
> > -	/* we probably should sync with hangcheck here, using cancel_work_sync.
> > -	 * Also locking seems to be fubar here, engine->request_list is protected
> > -	 * by dev->struct_mutex. */
> > +	mutex_lock(&dev->struct_mutex);
> > +	if (dev_priv->gt.active_engines)
> > +		goto out;
> >  
> > -	intel_mark_idle(dev_priv);
> > +	for_each_engine(engine, dev_priv)
> > +		i915_gem_batch_pool_fini(&engine->batch_pool);
> >  
> > -	if (mutex_trylock(&dev->struct_mutex)) {
> > -		for_each_engine(engine, dev_priv)
> > -			i915_gem_batch_pool_fini(&engine->batch_pool);
> > +	GEM_BUG_ON(!dev_priv->gt.awake);
> > +	dev_priv->gt.awake = false;
> >  
> > -		mutex_unlock(&dev->struct_mutex);
> > +	if (INTEL_INFO(dev_priv)->gen >= 6)
> > +		gen6_rps_idle(dev_priv);
> > +	intel_runtime_pm_put(dev_priv);
> > +out:
> > +	mutex_unlock(&dev->struct_mutex);
> > +
> > +	if (!dev_priv->gt.awake &&
> 
> No READ_ONCE here, even though we just unlocked the mutex. So it lacks
> some consistency.
> 
> Also, this assumes we might be pre-empted between unlocking the mutex
> and making this test, so I'm a little bit confused. Do you want to
> optimize by avoiding calling cancel_delayed_work_sync?

General principle: never call the *_work_sync() functions with locks held.
I had actually thought I had fixed this up (but realized that I just
rewrote hangcheck later on instead ;)

Ok, what I think is safer here is

	bool hangcheck = cancel_delayed_work_sync(hangcheck_work);

	mutex_lock();
	if (actually_idle()) {
		awake = false;
		missed_irq_rings |= intel_kick_waiters();
	}
	mutex_unlock();

	if (awake && hangcheck)
		queue_hangcheck();

So always kick the hangcheck and re-enable it if we tried to idle too
early. This will potentially delay hangcheck by one full hangcheck
period if we do encounter that race. But we shouldn't be hitting this
race that often, or hanging the GPU for that matter.

> > index 166f1a3829b0..d0cd9a1aa80e 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> > @@ -372,13 +372,13 @@ struct intel_engine_cs {
> >  };
> >  
> >  static inline bool
> > -intel_engine_initialized(struct intel_engine_cs *engine)
> > +intel_engine_initialized(const struct intel_engine_cs *engine)
> >  {
> >  	return engine->i915 != NULL;
> >  }
> >  
> >  static inline unsigned
> > -intel_engine_flag(struct intel_engine_cs *engine)
> > +intel_engine_flag(const struct intel_engine_cs *engine)
> >  {
> >  	return 1 << engine->id;
> >  }
> 
> I think the majority of our functions are not const-correct, I remember
> some grunting on the subject when I tried to change some to be. But I'm
> all for it myself.

Not yet, a few more gradual drive-bys and we'll be in position for a
grander scheme of correctness ;)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx