On Thu, 20 Dec 2012 17:57:04 -0800 Ben Widawsky <ben at bwidawsk.net> wrote:
> On Thu, Dec 20, 2012 at 03:34:11PM -0800, Jesse Barnes wrote:
> > On Wed, 24 Oct 2012 12:05:35 -0700 Ben Widawsky <ben at bwidawsk.net> wrote:
> > > On Sun, 21 Oct 2012 15:44:02 +0100 Chris Wilson <chris at chris-wilson.co.uk> wrote:
> > > > Even though we do not use the EI mode for determining when to change
> > > > GPU frequencies for RPS, changing this value causes no up interrupts
> > > > to be generated whilst an OpenGL client runs.
> > > >
> > > > Fixes regression from
> > > > commit 1ee9ae3244c4789f3184c5123f3b2d7e405b3f4c
> > > > Author: Daniel Vetter <daniel.vetter at ffwll.ch>
> > > > Date:   Wed Aug 15 10:41:45 2012 +0200
> > > >
> > > >     drm/i915: use hsw rps tuning values everywhere on gen6+
> > > >
> > > > Reported-by: Eric Anholt <eric at anholt.net>
> > > > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > > > Cc: Eric Anholt <eric at anholt.net>
> > > > Cc: Ben Widawsky <ben at bwidawsk.net>
> > > > Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
> > > > Cc: stable at vger.kernel.org
> > > > ---
> > > >  drivers/gpu/drm/i915/intel_pm.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> > > > index 81e88c2..15b585e 100644
> > > > --- a/drivers/gpu/drm/i915/intel_pm.c
> > > > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > > > @@ -2493,7 +2493,7 @@ static void gen6_enable_rps(struct drm_device *dev)
> > > >  	I915_WRITE(GEN6_RP_UP_THRESHOLD, 59400);
> > > >  	I915_WRITE(GEN6_RP_DOWN_THRESHOLD, 245000);
> > > > -	I915_WRITE(GEN6_RP_UP_EI, 66000);
> > > > +	I915_WRITE(GEN6_RP_UP_EI, IS_HASWELL(dev) ? 66000 : 100000);
> > > >  	I915_WRITE(GEN6_RP_DOWN_EI, 350000);
> > > >
> > > >  	I915_WRITE(GEN6_RP_IDLE_HYSTERSIS, 10);
> > >
> > > I've not verified this with interrupts, but simply by reading back the current frequency using the sysfs interface.
> > >
> > > What I've seen running xonotic on Ivybridge is that we do bump the frequency initially, and then it stops. Nearing the end of the demo, we again raise the frequency. Note that on my system, both before and after this patch, I am able to get to the max GPU frequency with the xonotic demo.
> > >
> > > Specifically, on my IVB, which has a range of 350->1100 with an RP1 of 650, I see the following (the demo is roughly 2 minutes):
> > >
> > > without patch:
> > >   Within a few seconds we cycle up to 750
> > >   Nothing for about 30 seconds
> > >   very slowly cycle up to 1100 (*just* before the demo ends)
> > >   demo ends; throttle down to 350 quickly
> > >
> > > with patch:
> > >   Within a few seconds we cycle up to 1000
> > >   Nothing for about 30 seconds
> > >   cycle up to 1100
> > >   demo ends; throttle down to 350 slowly
> > >
> > > I think if this fixes someone's critical issue, it's great, but unfortunately I do not see the problem the patch claims to fix. Furthermore, none of us can really make sense of why this has the effect that it does, but I believe a lot of that is because the workloads we run (in this case xonotic) are very much black boxes.
> > >
> > > Personally, on this IVB, I think the behavior before the patch is more desirable because it stays near RP1 for a longer period of time and drops back to the minimum quickly (but it's definitely a matter of opinion).
> >
> > It's a balance between power and performance.
> > Running at the higher freqs is definitely less power efficient, but without this patch we clearly have a performance regression (running at 750 instead of 1000 MHz for most of the demo, plus the cases Chris saw).
> >
> > But we also don't want to prevent RC6 entry on vulnerable systems, so maybe we need two sets of values, or a broader set of changes that works better across a swath of workloads.
>
> To me this sounds like an idea that is better on paper than in practice. Unless we can generalize the various sets of values with knobs for users to pick from, it will really be overwhelming (I feel), and will probably provide more bad data than good. Fine-grained knobs are already given via the registers, and really, pestering users can toy with those if they want. Perhaps it's on us to add better comments to i915_reg.h to explain a bit better how we think it works?
>
> > Also, the initial patch to use the HSW values implies that the values are applicable across generations. They're not. They're very specific to a given generation, and potentially even to different versions of a given generation, so sharing the values is probably a bad idea in general.
> >
> > --
> > Jesse Barnes, Intel Open Source Technology Center
>
> To set the record straight, I am pretty sure we can agree on a few things:
> 1. We don't really know wtf.
> 2. There will be no globally good set of values.
> 2a. There may even be no globally decent set of values.
> 3. One workload is insufficient to determine anything.
>
> If a workload is obviously too GPU-render-heavy, I think a decent solution is to just use the sysfs parameters to force a specific min. We don't really need to be skittish about setting this, as really *most* things are too GPU-render-heavy for us at the moment. For the general case I'll assert the performance 'profile' is the right thing unless someone proves otherwise, and here we just want the set of values which throttle down faster than they throttle up.

In the email in my head, that said "the power 'profile'".

> I think there is still some value in the proof-of-concept stuff you were hoping to do around a system-wide GPU power governor thing. Maybe we can just set the RPS autothrottle values to be as dumb as possible, and then use that governor to set the min/max frequencies as it sees fit. Another similar option is to try to use SW RC6. If we had an intern, I'd even suggest picking some decent ranges for all the thresholds and having that intern run benchmarks with all the different values.
>
> Okay, done wasting your time now..
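
For anyone who wants to try the "force a specific min" approach Ben describes, a minimal userspace sketch follows. It assumes the RPS sysfs attributes live at /sys/class/drm/card0/ under the names gt_cur_freq_mhz and gt_min_freq_mhz; neither the path nor the attribute names come from this thread, so check your own kernel before relying on them.

/* rps-min.c: hypothetical helper, not part of the i915 tree.
 * Reads the current RPS frequency and optionally pins a minimum,
 * assuming the sysfs paths below exist on the running kernel.
 */
#include <stdio.h>
#include <stdlib.h>

#define GT_CUR "/sys/class/drm/card0/gt_cur_freq_mhz"   /* assumed path */
#define GT_MIN "/sys/class/drm/card0/gt_min_freq_mhz"   /* assumed path */

int main(int argc, char **argv)
{
	FILE *f;
	int cur_mhz;

	/* Read back the current frequency, as Ben did while watching xonotic. */
	f = fopen(GT_CUR, "r");
	if (!f || fscanf(f, "%d", &cur_mhz) != 1) {
		perror("reading " GT_CUR);
		return EXIT_FAILURE;
	}
	fclose(f);
	printf("current GPU frequency: %d MHz\n", cur_mhz);

	/* Optionally force a specific min, e.g. "./rps-min 650" for RP1 on Ben's IVB. */
	if (argc > 1) {
		f = fopen(GT_MIN, "w");
		if (!f || fprintf(f, "%s\n", argv[1]) < 0) {
			perror("writing " GT_MIN);
			return EXIT_FAILURE;
		}
		fclose(f);
	}

	return EXIT_SUCCESS;
}

Run as root; for example "./rps-min 650" would hold the floor at RP1 on Ben's example IVB (range 350->1100), while running it with no argument only prints the current frequency.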