hsw rps values regress RPS on Macbook Air

jbarnes at virtuousgeek.org (Jesse Barnes) · Tue, 16 Oct 2012 06:53:45 -0700

On Fri, 12 Oct 2012 11:34:08 -0700
Eric Anholt <eric at anholt.net> wrote:

> Jesse Barnes <jbarnes at virtuousgeek.org> writes:
> 
> > On Tue, 09 Oct 2012 13:05:54 -0700
> > Eric Anholt <eric at anholt.net> wrote:
> >
> >> On my new MBA with danvet's drm-intel-next-queued, I'm not getting
> >> working RPS.  vblank_mode=0 glxgears never ups the frequency, and
> >> vblank_mode=0 openarena only makes it up to 500mhz.  Reverting
> >> 1ee9ae3244c4789f3184c5123f3b2d7e405b3f4c gets the machine to responsive
> >> RPS: fully on while the GPU is busy, fully lowered when it's not.
> >> 
> >> Since we're always just looking for all-on or all-off and never see
> >> workloads that actually want to be somewhere in between, could we please
> >> just move to race to idle for RPS?
> >
> > Ramping to the max freq is fine for benchmarking.  But for normal
> > vblank throttled activity, using the lowest freq (assuming it's
> > above our nominal freq) that can hit the refresh is the right answer
> > from a power perspective.
> 
> Have you seen any workloads where a middle frequency value is actually
> chosen by the current RPS system?

I can't tell if this is a snarky response or not. :)  But either way it
misses my point: I think the current RPS system isn't ideal for many of
our workloads and the way our GL stack runs things.  I've thought for a
while now that we could do better, but I couldn't think of a way to let
userspace request lower frequencies when it doesn't need the extra
processing power.  If we collect a little data in Mesa, maybe we can
do it.

I propose a new ioctl, I915_FREQ_REQUEST, with 3 different parameters,
I915_MAX_FREQ, I915_MORE_FREQ, and I915_LESS_FREQ.  The first would
tell the kernel the app would like to run at the maximum possible
speed, regardless of power or throttling considerations.  MORE
would simply tell the kernel the app needs a higher frequency to meet
its frame rate target, and LESS would tell the kernel it could run
slower and still hit its target.

In Mesa, we'd need to track the FPS target for the app, the current FPS
(e.g. over the last second, or using a decaying average with some
weight toward recent activity), and the time between swapbuffers calls
(as an approximation of how long it takes us to draw each frame).
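
A rough sketch of the decaying-average variant, just to show how little
state is involved.  The struct, the function names and the 0.25 weight
are all made up for illustration; none of this is existing Mesa code.

```c
#include <stdint.h>

/* Hypothetical per-drawable statistics; not an existing Mesa structure. */
struct frame_stats {
    uint64_t last_swap_ns;  /* timestamp of the previous swap, 0 if none yet */
    double avg_frame_ns;    /* decaying average of time between swaps */
};

/* Assumed weight given to the newest sample; higher reacts faster. */
#define FRAME_AVG_WEIGHT 0.25

/* Call once per swapbuffers with a monotonic timestamp in nanoseconds. */
static void frame_stats_on_swap(struct frame_stats *fs, uint64_t now_ns)
{
    if (fs->last_swap_ns != 0) {
        double delta = (double)(now_ns - fs->last_swap_ns);

        if (fs->avg_frame_ns == 0.0)
            fs->avg_frame_ns = delta;  /* first sample seeds the average */
        else
            fs->avg_frame_ns += FRAME_AVG_WEIGHT * (delta - fs->avg_frame_ns);
    }
    fs->last_swap_ns = now_ns;
}

/* Current FPS estimate, or 0 until two swaps have been seen. */
static double frame_stats_fps(const struct frame_stats *fs)
{
    return fs->avg_frame_ns > 0.0 ? 1e9 / fs->avg_frame_ns : 0.0;
}
```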

Periodically (maybe every second when we update our current FPS), Mesa
would either request more frequency if it wasn't hitting its FPS
target, or less frequency if its frame draw time was less than 90% of
the maximum allotted frame time (the period of the frame rate we're
trying to hit).  The FPS target would be based on the swap interval for
the app.

In a benchmarking mode (i.e. vblank_mode=0 or swapinterval set to 0),
we could just make an I915_MAX_FREQ request and be done with it.
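
Something like the following could make that periodic decision on the
Mesa side.  It's only a sketch: the enum, the function name and the 2%
slack on the FPS comparison are my assumptions here, and the 90%
threshold is just the number from above.

```c
/* Hypothetical request codes mirroring the proposed ioctl parameters. */
enum freq_request { FREQ_NONE, FREQ_MAX, FREQ_MORE, FREQ_LESS };

#define FPS_SLACK      0.98 /* assumed: within 2% of target counts as a hit */
#define FRAME_HEADROOM 0.90 /* ask for less below 90% of the frame period */

/*
 * Decide what to ask the kernel for.  refresh_hz must be positive.
 * frame_time_s is the measured time to draw one frame, in seconds.
 */
static enum freq_request
choose_freq_request(int swap_interval, double refresh_hz,
                    double current_fps, double frame_time_s)
{
    double target_fps, period_s;

    /* Benchmarking mode: no throttling, so just ask for everything. */
    if (swap_interval <= 0)
        return FREQ_MAX;

    target_fps = refresh_hz / swap_interval;
    period_s = 1.0 / target_fps;

    if (current_fps < target_fps * FPS_SLACK)
        return FREQ_MORE;   /* missing the frame rate target */
    if (frame_time_s < FRAME_HEADROOM * period_s)
        return FREQ_LESS;   /* plenty of headroom in each frame */
    return FREQ_NONE;       /* on target, leave things alone */
}
```

Checking the FPS target first means an app that is missing frames never
asks for less, even if an individual frame happened to be quick.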

Within the kernel, we'd evaluate every app's requests and choose the
max frequency requested, re-evaluating on every ioctl call and whenever
an app closes.
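
As a userspace model of that bookkeeping (not real i915 code), it might
look like the sketch below.  The fixed client table, the step size and
the idea that MORE/LESS move each client's own wish up or down by one
step are all assumptions on my part; the enum is repeated so the
snippet stands alone.

```c
#include <stddef.h>

#define MAX_CLIENTS 8   /* arbitrary table size for the model */

/* Hypothetical request codes mirroring the proposed ioctl parameters. */
enum freq_request { FREQ_NONE, FREQ_MAX, FREQ_MORE, FREQ_LESS };

struct rps_model {
    int min_freq, max_freq, step;  /* hardware limits and step, in MHz */
    int wanted[MAX_CLIENTS];       /* per-client wish, 0 = no request */
};

/* The frequency to program: the highest wish of any open client. */
static int rps_effective(const struct rps_model *m)
{
    int best = m->min_freq;
    size_t i;

    for (i = 0; i < MAX_CLIENTS; i++)
        if (m->wanted[i] > best)
            best = m->wanted[i];
    return best;
}

/* Handle one request from a client and return the new frequency. */
static int rps_request(struct rps_model *m, size_t client,
                       enum freq_request req)
{
    int cur;

    if (client >= MAX_CLIENTS)
        return rps_effective(m);

    cur = m->wanted[client] ? m->wanted[client] : m->min_freq;
    switch (req) {
    case FREQ_MAX:  cur = m->max_freq; break;
    case FREQ_MORE: cur += m->step;    break;
    case FREQ_LESS: cur -= m->step;    break;
    case FREQ_NONE: break;
    }
    if (cur > m->max_freq) cur = m->max_freq;
    if (cur < m->min_freq) cur = m->min_freq;
    m->wanted[client] = cur;
    return rps_effective(m);
}

/* Drop a client's request when it closes and re-evaluate. */
static int rps_close(struct rps_model *m, size_t client)
{
    if (client < MAX_CLIENTS)
        m->wanted[client] = 0;
    return rps_effective(m);
}
```

With this shape, one app asking for LESS can't slow down another app
that is still asking for MAX; the frequency only drops once the
highest requester backs off or exits.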

Any thoughts?  Would collecting the above info in Mesa be pretty easy?
I think we already collect FPS info if certain debug flags are set, and
frame time seems like a pretty trivial calculation based on some
timestamping in a couple of places...

Thanks,
Jesse