Re: [PATCH 0/3] Improve anti-pre-emption w/a for compute workloads

On 2/22/2022 01:53, Tvrtko Ursulin wrote:
> On 18/02/2022 21:33, John.C.Harrison@xxxxxxxxx wrote:
>> From: John Harrison <John.C.Harrison@xxxxxxxxx>
>>
>> Compute workloads are inherently not pre-emptible on current hardware.
>> Thus the pre-emption timeout was disabled as a workaround to prevent
>> unwanted resets. Instead, the hang detection was left to the heartbeat
>> and its (longer) timeout. This is undesirable with GuC submission as
>> the heartbeat is a full GT reset rather than a per engine reset and so
>> is much more destructive. Instead, just bump the pre-emption timeout
>
> Can we have a feature request to allow asking GuC for an engine reset?
For what purpose?

GuC manages the scheduling of contexts across engines. With virtual engines, the KMD has no knowledge of which engine a context might be executing on. Even without virtual engines, the KMD still has no knowledge of which context is currently executing on any given engine at any given time.

There is a reason why hang detection should be left to the entity that is doing the scheduling. Any other entity is second-guessing at best.

The reason for keeping the heartbeat around even when GuC submission is enabled is for the case where the KMD and GuC have got out of sync with each other somehow, or GuC itself has just crashed. I.e. when no submission at all is working and we need to reset the GuC itself and start over.
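
To make that division of labour concrete, here is a minimal sketch in C. It is not i915 or GuC code: the enum, the struct and choose_recovery() are all hypothetical names invented for illustration, and the real driver and firmware track far more state than two flags. It only assumes what is described above, namely that the scheduler's pre-emption timeout leads to a per engine reset, while a missed heartbeat means submission as a whole has stopped and leads to a full GT reset.

```c
#include <stdbool.h>

/* Hypothetical names throughout; this is an illustration, not driver code. */
enum recovery_action {
	RECOVERY_NONE,          /* nothing has timed out */
	RECOVERY_ENGINE_RESET,  /* scheduler resets the one engine it knows is hung */
	RECOVERY_FULL_GT_RESET, /* heartbeat fallback: reset everything, GuC included */
};

struct gt_state {
	/* Set by the scheduler: a context overran its pre-emption timeout. */
	bool preempt_timed_out;
	/* Set by the KMD: the (longer) heartbeat period expired with no progress. */
	bool heartbeat_missed;
};

/*
 * Pick the least destructive recovery that matches what was observed.
 * A missed heartbeat implies the scheduler failed to recover on its own
 * (out of sync or crashed), so only a full reset is left.
 */
enum recovery_action choose_recovery(const struct gt_state *s)
{
	if (s->heartbeat_missed)
		return RECOVERY_FULL_GT_RESET;
	if (s->preempt_timed_out)
		return RECOVERY_ENGINE_RESET;
	return RECOVERY_NONE;
}
```

In this model the heartbeat only ever fires as a last resort, which is why its timeout has to be longer than the pre-emption timeout: the scheduler must get the first chance to do the cheaper per engine reset.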

John.



> Regards,
>
> Tvrtko
>
>> to a big value. Also, update the heartbeat to allow such a long
>> pre-emption delay in the final heartbeat period.
>>
>> Signed-off-by: John Harrison <John.C.Harrison@xxxxxxxxx>
>>
>>
>> John Harrison (3):
>>    drm/i915/guc: Limit scheduling properties to avoid overflow
>>    drm/i915/gt: Make the heartbeat play nice with long pre-emption
>>      timeouts
>>    drm/i915: Improve long running OCL w/a for GuC submission
>>
>>   drivers/gpu/drm/i915/gt/intel_engine_cs.c     | 37 +++++++++++++++++--
>>   .../gpu/drm/i915/gt/intel_engine_heartbeat.c  | 16 ++++++++
>>   drivers/gpu/drm/i915/gt/sysfs_engines.c       | 14 +++++++
>>   drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |  9 +++++
>>   4 files changed, 73 insertions(+), 3 deletions(-)




