Clarification on the issue.

Consider a massive load on GT and just a tiny one on IA. If GT programs the RING frequency to be lower than the IA frequency, you end up in a situation where the RING frequency constantly transitions from the GT level to the IA level and back. Each transition of the RING frequency is a full system stall. Even with a "good" transition rate of a few transitions every few milliseconds, you lose ~10% of performance.

Media workloads step into this easily because 1) media utilizes a few GPU engines, and with a few parallel workloads at least one engine is _always_ doing something, and 2) media batch buffers (BB) are relatively small, so IA wakes up regularly to manage requests.

This affects Gen9 platforms due to a HW design change (we spotted this on SKL). It does not happen on Gen8 (old HW design), and it will be fixed in Gen10+ (CNL+).

On SKL we ran into this with the GPU frequency pinned to 700MHz and the CPU to 2GHz. Multipliers were x2 for GT and x1 for IA, so GT was requesting a 1.4GHz RING frequency while IA was requesting 2GHz.

So, effectively, what we need to do is make sure that the RING frequency request from GT is _not_ below the request from IA. If IA requests 2GHz, we can't request 1.4GHz; we need to request at least 2GHz. The multiplier patch was intended to do exactly that, but manually. Can we somehow automate that, taking the IA frequency requests to the RING into account?

Dmitry.
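To make the ask concrete, here is a minimal sketch of the rule described above: the RING request coming from GT is raised to at least the level IA would request. The helper name, its parameters, and the use of MHz are assumptions for illustration only; this is not an existing i915 function, and how the driver would learn the current IA request is exactly the open question.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of i915): compute the RING frequency
 * request in MHz so that it never falls below the IA-side request.
 *
 * gt_mhz / ia_mhz   - current GT and IA frequencies in MHz
 * gt_mult / ia_mult - ring multipliers applied to each side
 */
static inline uint32_t ring_request_mhz(uint32_t gt_mhz, uint32_t gt_mult,
                                        uint32_t ia_mhz, uint32_t ia_mult)
{
	uint32_t from_gt = gt_mhz * gt_mult; /* what GT would ask for */
	uint32_t from_ia = ia_mhz * ia_mult; /* what IA would ask for */

	/* Use the larger of the two so the RING does not bounce between levels. */
	return from_gt > from_ia ? from_gt : from_ia;
}
```

With the SKL numbers above, ring_request_mhz(700, 2, 2000, 1) gives 2000 instead of the 1400 that GT would request on its own.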
-----Original Message-----
From: Chris Wilson [mailto:chris@xxxxxxxxxxxxxxxxxx]
Sent: Tuesday, December 26, 2017 6:36 AM
To: Li, Yaodong <yaodong.li@xxxxxxxxx>; intel-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Gong, Zhipeng <zhipeng.gong@xxxxxxxxx>; Widawsky, Benjamin <benjamin.widawsky@xxxxxxxxx>; Mateo Lozano, Oscar <oscar.mateo@xxxxxxxxx>; Kamble, Sagar A <sagar.a.kamble@xxxxxxxxx>; Rogozhkin, Dmitry V <dmitry.v.rogozhkin@xxxxxxxxx>; Li, Yaodong <yaodong.li@xxxxxxxxx>
Subject: Re: [RFC] drm/i915: Add a new modparam for customized ring multiplier

Quoting Chris Wilson (2017-12-18 21:47:25)
> Quoting Jackie Li (2017-12-18 21:22:08)
> > From: Zhipeng Gong <zhipeng.gong@xxxxxxxxx>
> >
> > SKL platforms requires a higher ring multiplier when there's massive
> > GPU load. Current driver doesn't provide a way to override the ring
> > multiplier.
> >
> > This patch adds a new module parameter to allow the overriding of
> > ring multiplier for Gen9 platforms.
>
> So the default ring-scaling is not good enough, the first thing we do
> is to try and ensure the defaults work for nearly all use cases. My
> impression is that you want a nonlinear scalefactor, low power
> workloads don't try and ramp up the ring frequencies as aggressively,
> high power workloads try hard for higher frequencies, and then get
> throttled back harder as well. How well can we autotune it? What
> events tells us if the ratio is too high or too low?

One thing that came to mind is that we don't know the min/max ring
frequencies and just program them blindly. Is it the case that at max gpu
freq, there is still headroom on the ring freq, or do you require a
steeper ramp so that you hit the max ringfreq earlier for your workload
(which then presumably can run at less than max gpufreq, so pushing
power elsewhere).
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx