Re: [PATCH] drm/i915/gt: Restart the heartbeat timer when forcing a pulse

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 1/31/2024 10:48, Janusz Krzysztofik wrote:
Hi John,

On Wednesday, 10 January 2024 22:02:16 CET John.C.Harrison@xxxxxxxxx wrote:
From: John Harrison <John.C.Harrison@xxxxxxxxx>

The context persistence code does things like send super high priority
heartbeat pulses to ensure any leaked context can still be pre-empted
and thus isn't a total denial of service but only a minor denial of
service. Unfortunately, it wasn't bothering to restart the heatbeat
worker with a fresh timeout. Thus, if a persistent context happened to
be closed just before the heartbeat was going to go ping anyway then
the forced pulse would get a negligble execution time. And as the
forced pulse is super high priority, the worker thread's next step is
a reset. Which means a potentially innocent system randomly goes boom
when attempting to close a context. So, force a re-schedule of the
worker thread with the appropriate timeout.
I haven't looked too much in heartbeat pulses code before, but I think I can
understand your change.  I've also got a positive opinion from Chris on it.
I can provide my Ack, assuming the pre-merge failure reported by CI is not
related, but could you please comment that failure first and/or ask BUG Filing
to handle it so we have it cleaned up?
Pretty confident the CI failure is unrelated. Not seeing how a change to the heartbeat timing of persistence context clean up could cause a HDMI test to fail to complete.

However, I was really hoping for a full code review by someone who understands this code and would be able to say whether there could be unexpected side effects of the change. Or even if there is a better / safer way to fix the problem.

@Andi Shyti, you were fingered as being someone who might have such knowledge. Can you comment?

Thanks,
John.

Thanks,
Janusz


Signed-off-by: John Harrison <John.C.Harrison@xxxxxxxxx>
---
  drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
index 1a8e2b7db0138..4ae2fa0b61dd4 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
@@ -290,6 +290,9 @@ static int __intel_engine_pulse(struct intel_engine_cs *engine)
  	heartbeat_commit(rq, &attr);
  	GEM_BUG_ON(rq->sched.attr.priority < I915_PRIORITY_BARRIER);
+ /* Ensure the forced pulse gets a full period to execute */
+	next_heartbeat(engine);
+
  	return 0;
  }







[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux