On Fri, Sep 10, 2021 at 04:03:50PM +0100, Tvrtko Ursulin wrote: > > On 10/09/2021 15:24, Matt Roper wrote: > > On Fri, Sep 10, 2021 at 02:03:44PM +0100, Tvrtko Ursulin wrote: > > > > > > On 10/09/2021 06:33, Matt Roper wrote: > > > > Our uncore MMIO functions for reading/writing registers have become very > > > > complicated over time. There's significant macro magic used to generate > > > > several nearly-identical functions that only really differ in terms of > > > > which platform-specific shadow register table they should check on write > > > > operations. We can significantly simplify our MMIO handlers by storing > > > > a reference to the current platform's shadow table within the 'struct > > > > intel_uncore' the same way we already do for forcewake; this allows us > > > > to consolidate the multiple variants of each 'write' function down to > > > > just a single 'fwtable' version that gets the shadow table out of the > > > > uncore struct rather than hardcoding the name of a specific platform's > > > > table. We can do similar consolidation on the MMIO read side by > > > > creating a single-entry forcewake table to replace the open-coded range > > > > check they had been using previously. > > > > > > > > The final patch of the series adds a new shadow table for DG2; this > > > > becomes quite clean and simple now, given the refactoring in the first > > > > five patches. > > > > > > Tidy and it ends up saving kernel binary size. > > > > > > However I am undecided yet, because one thing to note is that the trade off > > > is source code and kernel text consolidation at the expense of more indirect > > > calls at runtime and larger common read/write functions. > > > > > > To expand, current code generates a bunch of per gen functions but in doing > > > so it manages to inline a bunch of checks like NEEDS_FORCE_WAKE and BSEARCH > > > (from find_fw_domain) so at runtime each platform mmio read/write does not > > > have to do indirect calls to do lookups. > > > > > > It may matter a lot in the grand scheme of things but this trade off is > > > something to note in the cover letter I think. > > > > That's true. However it seems like if the extra indirect calls are good > > enough for our forcewake lookups (which are called more frequently and > > have to search through much larger tables) then using the same strategy > > for shadow registers should be less of a concern. Plus most of > > timing-critical parts of the code don't call through this at all; they > > just grab an explicit forcewake and then issue a bunch of *_fw() > > operations that skip all the per-register forcewake and shadow handling. > > With lookups you mean intel_uncore_forcewake_for_reg? Yeah I don't have a > good idea of how many of those followed by "_fw" accessors we have vs > "un-optimized" access. But it's a good point. > > I was mostly coming from the point of view of old platforms like gen6, where > with this series reads go from inlined checks (NEEDS_FORCE_WAKE) to always > calling find_fw_domain. Just because it is a bit unfortunate to burden old > CPUs (they are not getting any faster) with executing more code. It's not > nice when old hardware gets slower and slower with software updates. :) But > whether or not this case would at all be measurable.. probably not. Unless > some compounding effects, like "death by thousand cuts", would come into > play. Chris pointed out in an offline mail that NEEDS_FORCE_WAKE does cut cut out a lot of display MMIO lookups. So I think it might be worth adding that back, but also adding an "|| GEN11_BSD_RING_BASE" so that it will still be accurate for the newer platforms too. But I think another thing to consider here would be that we might want to switch our intel_de_{read,write} wrappers to call raw mmio directly, to completely bypass forcewake and shadow logic. Matt > > Regards, > > Tvrtko > > > But you're right that this is something I should mention more clearly in > > the cover letter. > > > > > > Matt > > > > > > > > Regards, > > > > > > Tvrtko > > > > > > > Matt Roper (6): > > > > drm/i915/uncore: Convert gen6/gen7 read operations to fwtable > > > > drm/i915/uncore: Associate shadow table with uncore > > > > drm/i915/uncore: Replace gen8 write functions with general fwtable > > > > drm/i915/uncore: Drop gen11/gen12 mmio write handlers > > > > drm/i915/uncore: Drop gen11 mmio read handlers > > > > drm/i915/dg2: Add DG2-specific shadow register table > > > > > > > > drivers/gpu/drm/i915/intel_uncore.c | 190 ++++++++++-------- > > > > drivers/gpu/drm/i915/intel_uncore.h | 7 + > > > > drivers/gpu/drm/i915/selftests/intel_uncore.c | 1 + > > > > 3 files changed, 110 insertions(+), 88 deletions(-) > > > > > > -- Matt Roper Graphics Software Engineer VTT-OSGC Platform Enablement Intel Corporation (916) 356-2795