MSR for November / December

Ian Romanick <ian.d.romanick@xxxxxxxxx> · Fri, 05 Dec 2014 18:45:27 -0800

Short version:

- Khronos face-to-face
- BYT performance work

Longer version:

Yet another Khronos face-to-face meeting.  This was a special meeting
just for the gl_common working group to hammer out details of XGL (still
need a name!) so that we can at least have a chance of having a
provisional spec for GDC in March.  We made excellent progress on some
of the tougher issues, and I think there may actually be a chance of
having a usable spec by GDC.

There are still some major sticking points.  From my POV, the biggest
issue is that tile-based renderers (TBR) need some additional
information and / or limitations that immediate-mode renderers (IMR) do
not.  At best an IMR would just drop the data on the floor.  At worst an
IMR would lose performance due to extra state transitions.  On the
second day of the meetings there was a very heated two-hour "debate" about
this issue.  I think the only thing that came out of it was the obvious
conclusion that app developers need to specifically optimize
applications for TBRs.

While not in meetings or on airplanes, I spent some time looking at
where Mesa spends CPU time.  As soon as I started looking, I couldn't not
find problems.  I have about 30 patches of potential micro-optimizations
across the glUniform paths and the draw-time validation paths.  I'd love
to send these out, but I'm having quite a bit of difficulty getting
meaningful performance data to justify the changes.

I have tried several techniques, and I'm not terribly pleased with any
of them.  The most useful have been:

- Use callgrind to get instruction counts.  This provides stable
results, but it's sloooooooooooooow.  It also doesn't account for the
shared memory bus or power management interactions.

- Use a CPU-limited benchmark and measure its framerate.  This provides
full-system data, but the results aren't stable.  This means you have to
do many runs to detect small performance changes.  Very small changes
may just be undetectable.
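
To make the "many runs" point concrete, here is a rough sketch of how
framerate samples from two builds could be compared.  It is not what I
actually ran.  The function names, the 5% significance / 80% power
defaults, and the assumption that run-to-run noise is roughly normal are
all illustrative choices.  It uses Welch's t statistic for the
comparison and the usual two-sample size estimate for the number of runs.

```python
import math
import statistics


def welch_t(before, after):
    """Compare two lists of per-run framerates.

    Returns (mean difference, Welch t statistic, approximate degrees of
    freedom).  A |t| well above ~2 suggests the change is not just noise.
    """
    n1, n2 = len(before), len(after)
    m1, m2 = statistics.mean(before), statistics.mean(after)
    # Sample variances (n - 1 in the denominator).
    v1, v2 = statistics.variance(before), statistics.variance(after)
    se2 = v1 / n1 + v2 / n2
    t = (m2 - m1) / math.sqrt(se2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return m2 - m1, t, df


def runs_needed(stdev, min_delta, z_alpha=1.96, z_beta=0.84):
    """Rough number of runs per build needed to detect a framerate
    difference of min_delta when the run-to-run standard deviation is
    stdev.  The defaults assume a 5% two-sided test with 80% power.
    """
    return math.ceil(2 * ((z_alpha + z_beta) * stdev / min_delta) ** 2)
```

The second function is the painful part: the run count grows with the
square of (noise / change), so halving the size of the change you want
to detect quadruples the number of runs.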

I wish I had something like shader-db for measuring CPU changes. :(

It's worth noting that some of the SynMark2 timings were completely
botched.  I changed my build scripts to use the same compiler
optimization settings as a distro (I picked Fedora) so that I could get
results more representative of what real users would see.  In the
process, I accidentally left --enable-debug in the configure command.
This left assertions enabled in the code, and, yeah, that affects
performance.

Anyway... a few housekeeping patches have already been sent to mesa-dev,
and the rest should go out before the end of the year.

Next month:

All the travel.  LCA, another Khronos meeting, and FOSDEM.  I'm also
taking a week of vacation in January, and sleeping for most of February.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx