On Wed, Mar 23, 2016 at 04:24:48PM +0000, Tvrtko Ursulin wrote:
> Biggest thing to make sure is that you don't add a lot of cycles to
> the forcewake loops since for example fw_domains_get can be the
> hottest i915 function on some benchmarks.
>
> (This area slightly annoys me anyway with redundant looping over
> forcewake domains and we could also potentially optimize the ack
> waiting by first requesting all we want, and then doing the waits.
> That would be one additional loop, but if removed the other one,
> code would stay at the same number of domain loops.)

I hear you. I just end up weeping in the corner when I see
fw_domains_get on the profile. We already have a mitigation scheme
that holds onto the forcewake for an extra jiffy every time. I don't
like it, but without it fw_domains_get becomes a real hog.

Note that one thing we can actually do is restrict the domains we wake
up for the engines (engine->fw_domain) in execlists_submit; that should
help chv/skl+ a small amount.

I don't have a good idea for how to keep rc6 residency high but avoid
forcewake when those darn elsp writes require forcewake. As do gen6+
legacy RING_TAIL writes.

And even then that spinlock causes quite a bit of traffic when it
shouldn't be contended. I've been thinking about whether we can have
multiple locks (hashed by register), but we would then still need some
cross-communication for the common forcewake.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
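
For illustration only, here is a toy model of the "request all we want, then do the waits" idea from the quoted text. It is not i915 code: every name (toy_hw, toy_request, toy_get_batched, ...) is hypothetical, the hardware is a simulated tick counter, and it assumes the domains can wake concurrently with fixed, made-up ack latencies. Whether real hardware overlaps the wakeups like this is not something the thread establishes.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: these names are NOT the i915 driver's API. */
#define TOY_NUM_DOMAINS 3u

struct toy_hw {
	unsigned now;                           /* simulated time, in poll ticks */
	unsigned latency[TOY_NUM_DOMAINS];      /* assumed ticks from request to ack */
	unsigned requested_at[TOY_NUM_DOMAINS]; /* tick at which wake was requested */
	int requested[TOY_NUM_DOMAINS];
};

static struct toy_hw toy_hw_make(unsigned l0, unsigned l1, unsigned l2)
{
	struct toy_hw hw = { 0 };

	hw.latency[0] = l0;
	hw.latency[1] = l1;
	hw.latency[2] = l2;
	return hw;
}

/* Stand-in for writing the wake request of one domain. */
static void toy_request(struct toy_hw *hw, unsigned id)
{
	assert(id < TOY_NUM_DOMAINS);
	hw->requested[id] = 1;
	hw->requested_at[id] = hw->now;
}

/* Stand-in for polling the ack; each failed poll costs one tick. */
static void toy_wait_ack(struct toy_hw *hw, unsigned id)
{
	assert(id < TOY_NUM_DOMAINS && hw->requested[id]);
	while (hw->now - hw->requested_at[id] < hw->latency[id])
		hw->now++;
}

/* One loop: request a domain and wait for its ack before the next one. */
static unsigned toy_get_interleaved(struct toy_hw *hw, uint32_t mask)
{
	unsigned start = hw->now;

	for (unsigned id = 0; id < TOY_NUM_DOMAINS; id++) {
		if (!(mask & (1u << id)))
			continue;
		toy_request(hw, id);
		toy_wait_ack(hw, id);
	}
	return hw->now - start;
}

/* Two loops: request every wanted domain first, then collect the acks. */
static unsigned toy_get_batched(struct toy_hw *hw, uint32_t mask)
{
	unsigned start = hw->now;

	for (unsigned id = 0; id < TOY_NUM_DOMAINS; id++)
		if (mask & (1u << id))
			toy_request(hw, id);

	for (unsigned id = 0; id < TOY_NUM_DOMAINS; id++)
		if (mask & (1u << id))
			toy_wait_ack(hw, id);

	return hw->now - start;
}
```

In this model the interleaved variant spends the sum of the per-domain latencies polling, while the batched variant spends only the longest one, because the waits overlap. The batched variant does cost a second pass over the domains, which is the trade-off the quoted text points out.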