On 07/07/2014 20:14, Daniel Vetter wrote:
On Thu, Jun 26, 2014 at 06:24:06PM +0100, John.C.Harrison@xxxxxxxxx wrote:From: John Harrison <John.C.Harrison@xxxxxxxxx> The scheduler needs to do interrupt triggered work that is too complex to do in the interrupt handler. Thus it requires a deferred work handler to process this work asynchronously. --- drivers/gpu/drm/i915/i915_dma.c | 3 +++ drivers/gpu/drm/i915/i915_drv.h | 10 ++++++++++ drivers/gpu/drm/i915/i915_gem.c | 27 +++++++++++++++++++++++++++ drivers/gpu/drm/i915/i915_scheduler.c | 7 +++++++ drivers/gpu/drm/i915/i915_scheduler.h | 1 + 5 files changed, 48 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index 1668316..d1356f3 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -1813,6 +1813,9 @@ int i915_driver_unload(struct drm_device *dev) WARN_ON(unregister_oom_notifier(&dev_priv->mm.oom_notifier)); unregister_shrinker(&dev_priv->mm.shrinker);+ /* Cancel the scheduler work handler, which should be idle now. */+ cancel_work_sync(&dev_priv->mm.scheduler_work); + io_mapping_free(dev_priv->gtt.mappable); arch_phys_wc_del(dev_priv->gtt.mtrr);diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.hindex 0977653..fbafa68 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1075,6 +1075,16 @@ struct i915_gem_mm { struct delayed_work idle_work;/**+ * New scheme is to get an interrupt after every work packet + * in order to allow the low latency scheduling of pending + * packets. The idea behind adding new packets to a pending + * queue rather than directly into the hardware ring buffer + * is to allow high priority packets to over take low priority + * ones. + */ + struct work_struct scheduler_work;Latency for work items isn't too awesome, and e.g. Oscar's execlist code latches the next context right away from the irq handler. Why can't we do something similar for the scheduler? Fishing the next item out of a priority queue shouldn't be expensive ... -Daniel
The problem is that taking batch buffers from the scheduler's queue and submitting them to the hardware requires lots of processing that is not IRQ compatible. It isn't just a simple register write. Half of the code in 'i915_gem_do_execbuffer()' must be executed. Probably/possibly it could be made IRQ friendly but that would place a lot of restrictions on a lot of code that currently doesn't expect to be restricted. Instead, the submission is done via a work handler that acquires the driver mutex lock.
In order to cover the extra latency, the scheduler operates in a multi-buffered mode and aims to keep eight batch buffers in flight at all times. That number being obtained empirically by running lots of benchmarks on Android with lots of different settings and seeing where the buffer size stopped making a difference.
John.
+ + /** * Are we in a non-interruptible section of code like * modesetting? */ diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index fece5e7..57b24f0 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2712,6 +2712,29 @@ i915_gem_idle_work_handler(struct work_struct *work) intel_mark_idle(dev_priv->dev); }+#ifdef CONFIG_DRM_I915_SCHEDULER+static void +i915_gem_scheduler_work_handler(struct work_struct *work) +{ + struct intel_engine_cs *ring; + struct drm_i915_private *dev_priv; + struct drm_device *dev; + int i; + + dev_priv = container_of(work, struct drm_i915_private, mm.scheduler_work); + dev = dev_priv->dev; + + mutex_lock(&dev->struct_mutex); + + /* Do stuff: */ + for_each_ring(ring, dev_priv, i) { + i915_scheduler_remove(ring); + } + + mutex_unlock(&dev->struct_mutex); +} +#endif + /** * Ensures that an object will eventually get non-busy by flushing any required * write domains, emitting any outstanding lazy request and retiring and @@ -4916,6 +4939,10 @@ i915_gem_load(struct drm_device *dev) i915_gem_retire_work_handler); INIT_DELAYED_WORK(&dev_priv->mm.idle_work, i915_gem_idle_work_handler); +#ifdef CONFIG_DRM_I915_SCHEDULER + INIT_WORK(&dev_priv->mm.scheduler_work, + i915_gem_scheduler_work_handler); +#endif init_waitqueue_head(&dev_priv->gpu_error.reset_queue);/* On GEN3 we really need to make sure the ARB C3 LP bit is set */diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c index 66a6568..37f8a98 100644 --- a/drivers/gpu/drm/i915/i915_scheduler.c +++ b/drivers/gpu/drm/i915/i915_scheduler.c @@ -58,6 +58,13 @@ int i915_scheduler_init(struct drm_device *dev) return 0; }+int i915_scheduler_remove(struct intel_engine_cs *ring)+{ + /* Do stuff... */ + + return 0; +} + bool i915_scheduler_is_seqno_in_flight(struct intel_engine_cs *ring, uint32_t seqno, bool *completed) { diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h index 95641f6..6b2cc51 100644 --- a/drivers/gpu/drm/i915/i915_scheduler.h +++ b/drivers/gpu/drm/i915/i915_scheduler.h @@ -38,6 +38,7 @@ struct i915_scheduler { uint32_t index; };+int i915_scheduler_remove(struct intel_engine_cs *ring);bool i915_scheduler_is_seqno_in_flight(struct intel_engine_cs *ring, uint32_t seqno, bool *completed);--1.7.9.5 _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx