On 01/11/2016 01:16 AM, Chris Wilson wrote:
> Ideally, we want to automagically have the GPU respond to the
> instantaneous load by reclocking itself. However, reclocking occurs
> relatively slowly, and to the client waiting for a result from the GPU,
> too late. To compensate and reduce the client latency, we allow the
> first wait from a client to boost the GPU clocks to maximum. This
> overcomes the lag in autoreclocking, at the expense of forcing the GPU
> clocks too high. So to offset the excessive power usage, we currently
> allow a client to only boost the clocks once before we detect the GPU
> is idle again. This works reasonably for say the first frame in a
> benchmark, but for many more synchronous workloads (like OpenCL) we find
> the GPU clocks remain too low. By noting a wait which would idle the GPU
> (i.e. we just waited upon the last known request), we can give that
> client the idle boost credit (for their next wait) without the 100ms
> delay required for us to detect the GPU idle state. The intention is to
> boost clients that are stalling in the process of feeding the GPU more
> work (and who in doing so let the GPU idle), without granting boost
> credits to clients that are throttling themselves (such as compositors).
>
> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> Cc: "Zou, Nanhai" <nanhai.zou@xxxxxxxxx>
> Cc: Jesse Barnes <jbarnes@xxxxxxxxxxxxxxxx>
> Reviewed-by: Jesse Barnes <jbarnes@xxxxxxxxxxxxxxxx>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index e9f5ca7ea835..3fea582768e9 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1314,6 +1314,22 @@ complete:
>  		*timeout = 0;
>  	}
>  
> +	if (ret == 0 && rps && req->seqno == req->ring->last_submitted_seqno) {
> +		/* The GPU is now idle and this client has stalled.
> +		 * Since no other client has submitted a request in the
> +		 * meantime, assume that this client is the only one
> +		 * supplying work to the GPU but is unable to keep that
> +		 * work supplied because it is waiting. Since the GPU is
> +		 * then never kept fully busy, RPS autoclocking will
> +		 * keep the clocks relatively low, causing further delays.
> +		 * Compensate by giving the synchronous client credit for
> +		 * a waitboost next time.
> +		 */
> +		spin_lock(&req->i915->rps.client_lock);
> +		list_del_init(&rps->link);
> +		spin_unlock(&req->i915->rps.client_lock);
> +	}
> +
>  	return ret;
>  }
>

Assuming this works for the OCL guys, it seems ok. Doing the
list_del_init(&rps->link) is a bit of an obfuscated way of doing it, but
I guess the comment makes it pretty clear.

Jesse
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
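
For readers puzzling over why list_del_init(&rps->link) amounts to "credit
for a waitboost", here is a minimal userspace sketch of the idea. It is
not the driver code. It assumes, as the patch comment implies, that a
client may boost only while its link is not on the list of already-boosted
clients, so unlinking the client hands the credit back. The names struct
client, struct rps_state, try_waitboost() and grant_idle_credit() are
hypothetical, the list helpers are simplified re-implementations of the
kernel ones, and the client_lock locking is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified intrusive circular list, modelled on the kernel's list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* A node (or list head) pointing at itself is unlinked (or empty). */
static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add(struct list_head *node, struct list_head *head)
{
	assert(list_empty(node));	/* must not already be on a list */
	node->next = head->next;
	node->prev = head;
	head->next->prev = node;
	head->next = node;
}

/* Unlink and reinitialise; safe to call on an already unlinked node. */
static void list_del_init(struct list_head *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	list_init(node);
}

/* Hypothetical stand-ins for the per-client and global RPS state. */
struct client {
	struct list_head link;	/* on rps_state.boosted once credit is spent */
	unsigned int boosts;
};

struct rps_state {
	struct list_head boosted;	/* clients that have used their boost */
};

void rps_init(struct rps_state *rps)
{
	list_init(&rps->boosted);
}

void client_init(struct client *c)
{
	list_init(&c->link);
	c->boosts = 0;
}

/* Spend the credit: boost only if the client is not yet on the list. */
bool try_waitboost(struct rps_state *rps, struct client *c)
{
	if (!list_empty(&c->link))
		return false;	/* credit already spent */
	list_add(&c->link, &rps->boosted);
	c->boosts++;
	return true;
}

/* The step the patch adds: unlinking the client restores its credit. */
void grant_idle_credit(struct client *c)
{
	list_del_init(&c->link);
}
```

In this model a second try_waitboost() on the same client is refused until
grant_idle_credit() runs, after which the next wait boosts again. List
membership doubles as the "credit spent" flag, which is probably why the
bare list_del_init() reads as obfuscated without the comment.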