On Thu, 18 Apr 2024, Tvrtko Ursulin <tursulin@xxxxxxxxxxx> wrote: > On 18/04/2024 10:49, Jani Nikula wrote: >> On Wed, 17 Apr 2024, Lucas De Marchi <lucas.demarchi@xxxxxxxxx> wrote: >>> On Mon, Apr 08, 2024 at 03:54:44PM GMT, Jani Nikula wrote: >>>> The raw register reads/writes are there as micro-optimizations to avoid >>>> multiple pointer indirections on uncore->regs. Presumably this is useful >>>> when there are plenty of register reads/writes in the same >>>> function. However, the display irq handling only has a few raw >>>> reads/writes. Remove them for simplification. >>> >>> I think that comment didn't age well. Not to say there's something wrong >>> with this commit, but just to make sure we are aware of the additional >>> stuff going on and we if we are ok with that. >>> >>> using intel_de_read() in place of raw_reg_read() will do (for newer >>> platforms): >>> >>> 1) Read FPGA_DBG to detect unclaimed access before the actual read >>> 2) Find the relevant forcewake for that register, acquire and wait for ack >>> 3) readl(reg) >>> 4) Read FPGA_DBG to detect unclaimed access after the actual read >>> 5) Trace reg rw >>> >>> That's much more than a pointer indirection. Are we ok with that in the >>> irq? Also, I don't know why but we have variants to skip tracing (step >>> 5 above), but on my books a disabled tracepoint is order of magnitudes >>> less overhead than 1, 2 and 4. >> >> Honestly, I don't really know. >> >> The thing is, we have these ad hoc optimizations all over the place. Why >> do we have the raw access in two places, but not everywhere in irq >> handling? The pointer indirection thing really only makes sense if you >> have a lot of access in a function, but that's not the case. You do have >> a point about everything else. > > The "why only two" places is I think simply an artefact of refactoring > and code evolution. Initially all IRQ handling was in one function, then > later gen11 and display parts got split out as more platforms were > added. For example a3265d851e28 ("drm/i915/irq: Refactor gen11 display > interrupt handling"). > > As for the original rationale, it was described in commits like: > > 2e4a5b25886c ("drm/i915: Prune gen8_gt_irq_handler") > c48a798a7447 ("drm/i915: Trim the ironlake+ irq handler") > > Obviosuly, once a portion of a handler was/is extracted, pointer caching > to avoid uncore->regs reloads may not make full sense any more due > function calls potentially overshadowing that cost. > > As for unclaimed debug, I would say it is probably okay to not burden > the irq handlers with it, but if the display folks think a little bit of > extra cost in this sub-handlers is fine that would sound plausible to me > given the frequency of display related interrupts is low. So for me > patch is fine if it makes the display decoupling easier. > >> What would the interface be like if display were its own module? We >> couldn't just wrap it all in a bunch of macros and static inlines. Is >> the end result that display irq handling needs to call functions via >> pointers in another module? Or do we need to move the register level irq >> handling to xe and i915 cores, and handle the display parts at a higher >> abstraction level? > > AFAIR no trace variants were not for performance but to avoid log spam > when debugging stuff. From things like busy/polling loops. Bumping a forgotten topic. Ville, Rodrigo, are we okay with the changes here? BR, Jani. > > Regards, > > Tvrtko >>> >>> btw, if we drop the raw accesses, then we can probably drop (1) above. >>> >>> Lucas De Marchi >>> >>>> >>>> Signed-off-by: Jani Nikula <jani.nikula@xxxxxxxxx> >>>> --- >>>> drivers/gpu/drm/i915/display/intel_display_irq.c | 15 +++++++-------- >>>> 1 file changed, 7 insertions(+), 8 deletions(-) >>>> >>>> diff --git a/drivers/gpu/drm/i915/display/intel_display_irq.c b/drivers/gpu/drm/i915/display/intel_display_irq.c >>>> index f846c5b108b5..d4ae9139be39 100644 >>>> --- a/drivers/gpu/drm/i915/display/intel_display_irq.c >>>> +++ b/drivers/gpu/drm/i915/display/intel_display_irq.c >>>> @@ -1148,15 +1148,14 @@ void gen8_de_irq_handler(struct drm_i915_private *dev_priv, u32 master_ctl) >>>> >>>> u32 gen11_gu_misc_irq_ack(struct drm_i915_private *i915, const u32 master_ctl) >>>> { >>>> - void __iomem * const regs = intel_uncore_regs(&i915->uncore); >>>> u32 iir; >>>> >>>> if (!(master_ctl & GEN11_GU_MISC_IRQ)) >>>> return 0; >>>> >>>> - iir = raw_reg_read(regs, GEN11_GU_MISC_IIR); >>>> + iir = intel_de_read(i915, GEN11_GU_MISC_IIR); >>>> if (likely(iir)) >>>> - raw_reg_write(regs, GEN11_GU_MISC_IIR, iir); >>>> + intel_de_write(i915, GEN11_GU_MISC_IIR, iir); >>>> >>>> return iir; >>>> } >>>> @@ -1169,18 +1168,18 @@ void gen11_gu_misc_irq_handler(struct drm_i915_private *i915, const u32 iir) >>>> >>>> void gen11_display_irq_handler(struct drm_i915_private *i915) >>>> { >>>> - void __iomem * const regs = intel_uncore_regs(&i915->uncore); >>>> - const u32 disp_ctl = raw_reg_read(regs, GEN11_DISPLAY_INT_CTL); >>>> + u32 disp_ctl; >>>> >>>> disable_rpm_wakeref_asserts(&i915->runtime_pm); >>>> /* >>>> * GEN11_DISPLAY_INT_CTL has same format as GEN8_MASTER_IRQ >>>> * for the display related bits. >>>> */ >>>> - raw_reg_write(regs, GEN11_DISPLAY_INT_CTL, 0x0); >>>> + disp_ctl = intel_de_read(i915, GEN11_DISPLAY_INT_CTL); >>>> + >>>> + intel_de_write(i915, GEN11_DISPLAY_INT_CTL, 0); >>>> gen8_de_irq_handler(i915, disp_ctl); >>>> - raw_reg_write(regs, GEN11_DISPLAY_INT_CTL, >>>> - GEN11_DISPLAY_IRQ_ENABLE); >>>> + intel_de_write(i915, GEN11_DISPLAY_INT_CTL, GEN11_DISPLAY_IRQ_ENABLE); >>>> >>>> enable_rpm_wakeref_asserts(&i915->runtime_pm); >>>> } >>>> -- >>>> 2.39.2 >>>> >> -- Jani Nikula, Intel