On Tue, Sep 17, 2024 at 01:58:19PM +0300, Jani Nikula wrote: > On Thu, 18 Apr 2024, Tvrtko Ursulin <tursulin@xxxxxxxxxxx> wrote: > > On 18/04/2024 10:49, Jani Nikula wrote: > >> On Wed, 17 Apr 2024, Lucas De Marchi <lucas.demarchi@xxxxxxxxx> wrote: > >>> On Mon, Apr 08, 2024 at 03:54:44PM GMT, Jani Nikula wrote: > >>>> The raw register reads/writes are there as micro-optimizations to avoid > >>>> multiple pointer indirections on uncore->regs. Presumably this is useful > >>>> when there are plenty of register reads/writes in the same > >>>> function. However, the display irq handling only has a few raw > >>>> reads/writes. Remove them for simplification. > >>> > >>> I think that comment didn't age well. Not to say there's something wrong > >>> with this commit, but just to make sure we are aware of the additional > >>> stuff going on and we if we are ok with that. > >>> > >>> using intel_de_read() in place of raw_reg_read() will do (for newer > >>> platforms): > >>> > >>> 1) Read FPGA_DBG to detect unclaimed access before the actual read > >>> 2) Find the relevant forcewake for that register, acquire and wait for ack > >>> 3) readl(reg) > >>> 4) Read FPGA_DBG to detect unclaimed access after the actual read > >>> 5) Trace reg rw > >>> > >>> That's much more than a pointer indirection. Are we ok with that in the > >>> irq? Also, I don't know why but we have variants to skip tracing (step > >>> 5 above), but on my books a disabled tracepoint is order of magnitudes > >>> less overhead than 1, 2 and 4. > >> > >> Honestly, I don't really know. > >> > >> The thing is, we have these ad hoc optimizations all over the place. Why > >> do we have the raw access in two places, but not everywhere in irq > >> handling? The pointer indirection thing really only makes sense if you > >> have a lot of access in a function, but that's not the case. You do have > >> a point about everything else. > > > > The "why only two" places is I think simply an artefact of refactoring > > and code evolution. Initially all IRQ handling was in one function, then > > later gen11 and display parts got split out as more platforms were > > added. For example a3265d851e28 ("drm/i915/irq: Refactor gen11 display > > interrupt handling"). > > > > As for the original rationale, it was described in commits like: > > > > 2e4a5b25886c ("drm/i915: Prune gen8_gt_irq_handler") Looking at this one it looks that the raw usage came in place to fix a macro issue, that we don't have anymore anyway. > > c48a798a7447 ("drm/i915: Trim the ironlake+ irq handler") Then, looking at this one, it sounds a good optimization. > > > > Obviosuly, once a portion of a handler was/is extracted, pointer caching > > to avoid uncore->regs reloads may not make full sense any more due > > function calls potentially overshadowing that cost. > > > > As for unclaimed debug, I would say it is probably okay to not burden > > the irq handlers with it, but if the display folks think a little bit of > > extra cost in this sub-handlers is fine that would sound plausible to me > > given the frequency of display related interrupts is low. Well, looking at the optimization above I always had the initial thought on the low frequency of display interrupts, because I thought about hotplugs. But perhaps an optimization in vblank ones would be desireable? > > So for me > > patch is fine if it makes the display decoupling easier. > > > >> What would the interface be like if display were its own module? We > >> couldn't just wrap it all in a bunch of macros and static inlines. Is > >> the end result that display irq handling needs to call functions via > >> pointers in another module? Or do we need to move the register level irq > >> handling to xe and i915 cores, and handle the display parts at a higher > >> abstraction level? > > > > AFAIR no trace variants were not for performance but to avoid log spam > > when debugging stuff. From things like busy/polling loops. > > Bumping a forgotten topic. > > Ville, Rodrigo, are we okay with the changes here? I am in favor of this patch. Let's unify things. But perhaps study if we need as a follow-up some optimization in vblank or any other display irq and get that done inside intel_de_ mmio helpers?! > > BR, > Jani. > > > > > Regards, > > > > Tvrtko > >>> > >>> btw, if we drop the raw accesses, then we can probably drop (1) above. > >>> > >>> Lucas De Marchi > >>> > >>>> > >>>> Signed-off-by: Jani Nikula <jani.nikula@xxxxxxxxx> > >>>> --- > >>>> drivers/gpu/drm/i915/display/intel_display_irq.c | 15 +++++++-------- > >>>> 1 file changed, 7 insertions(+), 8 deletions(-) > >>>> > >>>> diff --git a/drivers/gpu/drm/i915/display/intel_display_irq.c b/drivers/gpu/drm/i915/display/intel_display_irq.c > >>>> index f846c5b108b5..d4ae9139be39 100644 > >>>> --- a/drivers/gpu/drm/i915/display/intel_display_irq.c > >>>> +++ b/drivers/gpu/drm/i915/display/intel_display_irq.c > >>>> @@ -1148,15 +1148,14 @@ void gen8_de_irq_handler(struct drm_i915_private *dev_priv, u32 master_ctl) > >>>> > >>>> u32 gen11_gu_misc_irq_ack(struct drm_i915_private *i915, const u32 master_ctl) > >>>> { > >>>> - void __iomem * const regs = intel_uncore_regs(&i915->uncore); > >>>> u32 iir; > >>>> > >>>> if (!(master_ctl & GEN11_GU_MISC_IRQ)) > >>>> return 0; > >>>> > >>>> - iir = raw_reg_read(regs, GEN11_GU_MISC_IIR); > >>>> + iir = intel_de_read(i915, GEN11_GU_MISC_IIR); > >>>> if (likely(iir)) > >>>> - raw_reg_write(regs, GEN11_GU_MISC_IIR, iir); > >>>> + intel_de_write(i915, GEN11_GU_MISC_IIR, iir); > >>>> > >>>> return iir; > >>>> } > >>>> @@ -1169,18 +1168,18 @@ void gen11_gu_misc_irq_handler(struct drm_i915_private *i915, const u32 iir) > >>>> > >>>> void gen11_display_irq_handler(struct drm_i915_private *i915) > >>>> { > >>>> - void __iomem * const regs = intel_uncore_regs(&i915->uncore); > >>>> - const u32 disp_ctl = raw_reg_read(regs, GEN11_DISPLAY_INT_CTL); > >>>> + u32 disp_ctl; > >>>> > >>>> disable_rpm_wakeref_asserts(&i915->runtime_pm); > >>>> /* > >>>> * GEN11_DISPLAY_INT_CTL has same format as GEN8_MASTER_IRQ > >>>> * for the display related bits. > >>>> */ > >>>> - raw_reg_write(regs, GEN11_DISPLAY_INT_CTL, 0x0); > >>>> + disp_ctl = intel_de_read(i915, GEN11_DISPLAY_INT_CTL); > >>>> + > >>>> + intel_de_write(i915, GEN11_DISPLAY_INT_CTL, 0); > >>>> gen8_de_irq_handler(i915, disp_ctl); > >>>> - raw_reg_write(regs, GEN11_DISPLAY_INT_CTL, > >>>> - GEN11_DISPLAY_IRQ_ENABLE); > >>>> + intel_de_write(i915, GEN11_DISPLAY_INT_CTL, GEN11_DISPLAY_IRQ_ENABLE); > >>>> > >>>> enable_rpm_wakeref_asserts(&i915->runtime_pm); > >>>> } > >>>> -- > >>>> 2.39.2 > >>>> > >> > > -- > Jani Nikula, Intel