On 1/23/2025 8:23 AM, Zhanjun Dong wrote:
The purpose of synchronize_irq is to wait for any pending IRQ handlers for the
interrupt to complete, if synchronize_irq called before interrupt disabled, an
tiny timing window created, where no more pending IRQ, but interrupt not
disabled yet. Meanwhile, if the interrupt event happened in this timing window,
an unexpected IRQ handling will be triggered.
Fixed by always disable interrupt ahead of synchronize_irq.
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13454
Fixes: 26705e20752a ("drm/i915: Support for GuC interrupts")
Fixes: 54c52a841250 ("drm/i915/guc: Correctly handle GuC interrupts on Gen11")
Fixes: 2ae096872a2c ("drm/i915/pxp: Implement PXP irq handler")
Fixes: 3e7abf814193 ("drm/i915: Extract GT render power state management")
Signed-off-by: Zhanjun Dong <zhanjun.dong@xxxxxxxxx>
---
Cc: Alan Previn <alan.previn.teres.alexis@xxxxxxxxx>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@xxxxxxxxx>
Cc: Rodrigo Vivi <rodrigo.vivi@xxxxxxxxx>
Cc: Michal Wajdeczko <michal.wajdeczko@xxxxxxxxx>
Cc: Andi Shyti <andi.shyti@xxxxxxxxx>
---
drivers/gpu/drm/i915/gt/intel_rps.c | 3 +--
drivers/gpu/drm/i915/gt/uc/intel_guc.c | 4 ++--
drivers/gpu/drm/i915/pxp/intel_pxp_irq.c | 2 +-
3 files changed, 4 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c
index fa304ea088e4..0fe7a8d7f460 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -244,8 +244,8 @@ static void rps_disable_interrupts(struct intel_rps *rps)
gen6_gt_pm_disable_irq(gt, GEN6_PM_RPS_EVENTS);
spin_unlock_irq(gt->irq_lock);
+ rps_reset_interrupts(rps);
intel_synchronize_irq(gt->i915);
I don't think this is an issue, because we set the irq mask in
gen6_gt_pm_disable_irq, so there is no chance of getting any new
interrupts after that. Not saying that we shouldn't do the re-order, but
we don't need a fixes tag for this.
-
/*
* Now that we will not be generating any more work, flush any
* outstanding tasks. As we are called on the RPS idle path,
@@ -254,7 +254,6 @@ static void rps_disable_interrupts(struct intel_rps *rps)
*/
cancel_work_sync(&rps->work);
- rps_reset_interrupts(rps);
GT_TRACE(gt, "interrupts:off\n");
}
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
index 5949ff0b0161..3e7b2c6cdca4 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -116,9 +116,9 @@ static void gen9_disable_guc_interrupts(struct intel_guc *guc)
gen6_gt_pm_disable_irq(gt, gt->pm_guc_events);
spin_unlock_irq(gt->irq_lock);
- intel_synchronize_irq(gt->i915);
gen9_reset_guc_interrupts(guc);
+ intel_synchronize_irq(gt->i915);
Same as above with gen6_gt_pm_disable_irq
}
static bool __gen11_reset_guc_interrupts(struct intel_gt *gt)
@@ -154,9 +154,9 @@ static void gen11_disable_guc_interrupts(struct intel_guc *guc)
struct intel_gt *gt = guc_to_gt(guc);
guc->interrupts.enabled = false;
- intel_synchronize_irq(gt->i915);
gen11_reset_guc_interrupts(guc);
+ intel_synchronize_irq(gt->i915);
No early disabling here, but I don't think this change helps either.
AFAICS gen11_reset_guc_interrupts() only calls gen11_gt_reset_one_iir(),
which just clears the IIR bits for the GuC. There are no changes in
interrupt enable/mask status, so interrupts can still be generated. The
way interrupts are stopped for gen11+ is by setting
guc->interrupts.enabled, because that's checked from both
guc_irq_handler() and intel_guc_to_host_event_handler(), so any new
interrupts generated after we set that should be immediately dropped.
}
static void guc_dead_worker_func(struct work_struct *w)
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c b/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c
index d81750b9bdda..b82a667e7ac0 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c
@@ -101,9 +101,9 @@ void intel_pxp_irq_disable(struct intel_pxp *pxp)
__pxp_set_interrupts(gt, 0);
spin_unlock_irq(gt->irq_lock);
- intel_synchronize_irq(gt->i915);
pxp_irq_reset(gt);
+ intel_synchronize_irq(gt->i915);
Again not a bug here, __pxp_set_interrupts is doing the interrupts
disabling and that is already happening before intel_synchronize_irq().
Daniele
flush_work(&pxp->session_work);
}