[PATCH] drm/i915: Use MLC (l3$) for context objects

daniel at ffwll.ch (Daniel Vetter) · Mon, 8 Apr 2013 17:12:27 +0200

On Mon, Apr 08, 2013 at 07:57:40AM -0700, Kenneth Graunke wrote:
> On 04/08/2013 06:28 AM, Chris Wilson wrote:
> >Enabling context support increases SwapBuffers latency by about 20%
> >(measured on an i7-3720qm). We can offset that loss slightly by enabling
> >faster caching for the contexts. As they are not backed by any
> >particular cache (such as the sampler or render caches) our only option
> >is to select the generic mid-level cache. This reduces the latency of
> >the swap by about 5%.
> >
> >Oddly this effect can be observed running smokin-guns on IVB at
> >1280x1024:
> >Using BLT copies for swaps: 151.67 fps
> >Using Render copies for swaps (unpatched):  141.70 fps
> >With contexts disabled: 150.23 fps
> >With contexts in L3$: 150.77 fps
> >
> >Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> >Cc: Ben Widawsky <ben at bwidawsk.net>
> >Cc: Kenneth Graunke <kenneth at whitecape.org>
> >---
> >  drivers/gpu/drm/i915/i915_gem_context.c |    7 +++++++
> >  1 file changed, 7 insertions(+)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> >index 94d873a..a1e8ecb 100644
> >--- a/drivers/gpu/drm/i915/i915_gem_context.c
> >+++ b/drivers/gpu/drm/i915/i915_gem_context.c
> >@@ -152,6 +152,13 @@ create_hw_context(struct drm_device *dev,
> >  		return ERR_PTR(-ENOMEM);
> >  	}
> >
> >+	if (INTEL_INFO(dev)->gen >= 7) {
> >+		ret = i915_gem_object_set_cache_level(ctx->obj,
> >+						      I915_CACHE_LLC_MLC);
> >+		if (ret)
> >+			goto err_out;
> >+	}
> >+
> >  	/* The ring associated with the context object is handled by the normal
> >  	 * object tracking code. We give an initial ring value simple to pass an
> >  	 * assertion in the context switch code.
> 
> Sounds good to me, Chris.  Is this also useful on Sandybridge?  I
> don't see why we couldn't use both L3 & LLC there as well.

Snb doesn't have gpu l3 afaik.
> 
> Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>
> 
> On a tangent, it would be nice to clean up the I915_CACHE_* enums.
> The term "MLC" isn't really used in the modern documentation, and
> there's even text that says "Ivybridge doesn't have MLC".  But what
> it really means here is LLC + L3.  It's probably not worth reworking
> until we can send out the new Haswell bits, though.

Yeah, my idea is that iternally we split this into cache_level + flags,
where cache_level controls how coherency works (i.e. where we have to
clflush and do similar magic) and the flags do the fine-tuning (like
enabling l3 caching). But like you've said, can wait for more hsw magic
(or gfdt) landing.

Queued for -next, thanks for the patch.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch