On 24/03/16 12:27, Chris Wilson wrote:
> On Thu, Mar 24, 2016 at 11:37:07AM +0000, Tvrtko Ursulin wrote:
>> On 23/03/16 16:40, Chris Wilson wrote:
>>> On Wed, Mar 23, 2016 at 04:24:48PM +0000, Tvrtko Ursulin wrote:
>>>> Biggest thing to make sure of is that you don't add a lot of cycles to
>>>> the forcewake loops since, for example, fw_domains_get can be the
>>>> hottest i915 function on some benchmarks.
>>>>
>>>> (This area slightly annoys me anyway with redundant looping over
>>>> forcewake domains and we could also potentially optimize the ack
>>>> waiting by first requesting all we want, and then doing the waits.
>>>> That would be one additional loop, but if we removed the other one,
>>>> the code would stay at the same number of domain loops.)
>>>
>>> I hear you. I just end up weeping in the corner when I see fw_domain_get
>>> on the profile.
>>>
>>> We already do have a mitigation scheme to hold onto the forcewake for an
>>> extra jiffy every time. I don't like it, but without it fw_domains_get
>>> becomes a real hog.
>>
>> I am pretty sure I've seen some tests which somehow defeat the
>> jiffy delay and we end up re-acquiring every ms/jiffy. This is
>> something I wanted to get to the bottom of but have not got round to
>> yet. It was totally unexpected because the test is hammering on
>> everything.
>
> Absolutely sure it is not just the delay in acquiring the ack? And
> spinning on waiting for the thread_c0 doesn't come cheap? I've just
> written off fw_domain_get being high on the profiles simply because
> we have to spin so long (I'm jaded because on Sandybridge spinning for
> 50us+ isn't uncommon iirc).

I am not sure, I just know I had a printk in the timer release and it
was firing every millisecond which completely perplexed me since I was
running gem_exec_nop/all at the time.
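
For reference, why a release timer firing every millisecond is surprising can be shown with a toy userspace model. This is only a sketch under stated assumptions, not i915 code: the names fw_model, fw_access, simulate and HOLD_US are hypothetical, the hold window is assumed to be exactly 1000us (one jiffy at HZ=1000), and the real timer expires on a tick boundary rather than a fixed interval after the last access.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: one jiffy is 1000us (HZ=1000). */
#define HOLD_US 1000UL

/* Hypothetical model of one forcewake domain with a deferred release. */
struct fw_model {
	bool awake;                   /* domain currently held awake */
	unsigned long last_access_us; /* simulated time of the last access */
	unsigned long acquires;       /* slow-path wakes (would wait for acks) */
	unsigned long releases;       /* times the release timer fired */
};

/* One register access at simulated time now_us. */
static void fw_access(struct fw_model *fw, unsigned long now_us)
{
	assert(now_us >= fw->last_access_us); /* time must not run backwards */

	/* The deferred release fired if a full hold window passed idle. */
	if (fw->awake && now_us - fw->last_access_us >= HOLD_US) {
		fw->awake = false;
		fw->releases++;
	}

	/* Slow path: wake the domain again. */
	if (!fw->awake) {
		fw->awake = true;
		fw->acquires++;
	}

	fw->last_access_us = now_us;
}

/* Issue n accesses spaced gap_us apart; return the number of wakes. */
static unsigned long simulate(unsigned long gap_us, unsigned long n)
{
	struct fw_model fw = { 0 };
	unsigned long i;

	for (i = 0; i < n; i++)
		fw_access(&fw, i * gap_us);

	return fw.acquires;
}
```

In this model simulate(100, 100) wakes the domain once, while simulate(1500, 100) wakes it on every access. A workload that touches the hardware continuously would be expected to look like the first case, so a timer that still fires every millisecond suggests either gaps longer than the hold window or something the model does not capture.
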
Good point that the cost might actually be in waiting for the acks.
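
If the ack wait does dominate, the request-all-then-wait idea quoted earlier in the thread could be sketched as below. This is a standalone userspace illustration, not the i915 implementation: fw_domain_model, fw_request, fw_wait_ack, get_interleaved and get_batched are hypothetical names, time is simulated, and it assumes the hardware can process requests for different domains in parallel, which the thread does not confirm.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one forcewake domain. */
struct fw_domain_model {
	unsigned long ack_latency_us; /* made-up time from request to ack */
	unsigned long ack_ready_at;   /* simulated time the ack becomes visible */
};

/* Post the wake request at simulated time now. */
static void fw_request(struct fw_domain_model *d, unsigned long now)
{
	d->ack_ready_at = now + d->ack_latency_us;
}

/* Spin until the ack is visible; returns the time the wait finishes. */
static unsigned long fw_wait_ack(const struct fw_domain_model *d,
				 unsigned long now)
{
	return now > d->ack_ready_at ? now : d->ack_ready_at;
}

/* Request and wait one domain at a time: latencies add up. */
static unsigned long get_interleaved(struct fw_domain_model *d, size_t n)
{
	unsigned long now = 0;
	size_t i;

	assert(d && n > 0);
	for (i = 0; i < n; i++) {
		fw_request(&d[i], now);
		now = fw_wait_ack(&d[i], now);
	}
	return now;
}

/* Request every domain first, then wait: latencies overlap. */
static unsigned long get_batched(struct fw_domain_model *d, size_t n)
{
	unsigned long now = 0;
	size_t i;

	assert(d && n > 0);
	for (i = 0; i < n; i++)
		fw_request(&d[i], now);
	for (i = 0; i < n; i++)
		now = fw_wait_ack(&d[i], now);
	return now;
}
```

With three domains and illustrative latencies of 50us, 30us and 20us, the interleaved version finishes at 100us and the batched one at 50us, so under the parallel-ack assumption the total wait drops from the sum of the latencies to the largest one. Both versions still make two passes' worth of work at most, which matches the remark about keeping the number of domain loops unchanged.
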
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx