On 18/02/2022 21:33, John.C.Harrison@xxxxxxxxx wrote:
> From: John Harrison <John.C.Harrison@xxxxxxxxx>
>
> Compute workloads are inherently not pre-emptible on current hardware.
> Thus the pre-emption timeout was disabled as a workaround to prevent
> unwanted resets. Instead, the hang detection was left to the heartbeat
> and its (longer) timeout. This is undesirable with GuC submission as
> the heartbeat is a full GT reset rather than a per engine reset and so
> is much more destructive. Instead, just bump the pre-emption timeout

Can we have a feature request to allow asking GuC for an engine reset?

Regards,

Tvrtko

> to a big value. Also, update the heartbeat to allow such a long
> pre-emption delay in the final heartbeat period.
>
> Signed-off-by: John Harrison <John.C.Harrison@xxxxxxxxx>
>
> John Harrison (3):
>   drm/i915/guc: Limit scheduling properties to avoid overflow
>   drm/i915/gt: Make the heartbeat play nice with long pre-emption timeouts
>   drm/i915: Improve long running OCL w/a for GuC submission
>
>  drivers/gpu/drm/i915/gt/intel_engine_cs.c     | 37 +++++++++++++++++--
>  .../gpu/drm/i915/gt/intel_engine_heartbeat.c  | 16 ++++++++
>  drivers/gpu/drm/i915/gt/sysfs_engines.c       | 14 +++++++
>  drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |  9 +++++
>  4 files changed, 73 insertions(+), 3 deletions(-)
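
For readers following the thread, the two ideas in the quoted cover letter (limiting a scheduling property so it cannot overflow, and stretching the final heartbeat period to cover a long pre-emption timeout) might look roughly like the sketch below. This is only an illustration under stated assumptions, not code from the series: every name prefixed `sketch_` is hypothetical, the limit assumes the timeout is supplied in milliseconds and later stored as microseconds in a 32-bit field, and the rule "final period is at least the pre-emption timeout" is a guess at the intent.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical limit, not taken from the patches: the largest millisecond
 * value whose conversion to microseconds still fits in a 32-bit field.
 */
#define SKETCH_MAX_TIMEOUT_MS (UINT32_MAX / 1000u)

/* Clamp a requested timeout so the later ms -> us conversion cannot overflow. */
static uint32_t sketch_clamp_timeout_ms(uint64_t requested_ms)
{
	if (requested_ms > SKETCH_MAX_TIMEOUT_MS)
		return SKETCH_MAX_TIMEOUT_MS;
	return (uint32_t)requested_ms;
}

/* Convert an already clamped timeout to microseconds. */
static uint32_t sketch_timeout_ms_to_us(uint32_t timeout_ms)
{
	assert(timeout_ms <= SKETCH_MAX_TIMEOUT_MS);
	return timeout_ms * 1000u;
}

/*
 * Assumed rule: the final heartbeat period is never shorter than the
 * pre-emption timeout, so a long-running compute job gets the whole
 * timeout before the heartbeat declares a hang.
 */
static uint32_t sketch_final_heartbeat_ms(uint32_t heartbeat_ms,
					  uint32_t preempt_timeout_ms)
{
	return preempt_timeout_ms > heartbeat_ms ? preempt_timeout_ms
						  : heartbeat_ms;
}
```

With these helpers, an oversized request is reduced to `SKETCH_MAX_TIMEOUT_MS` before conversion, and a short heartbeat interval is extended only in the final period rather than for every heartbeat.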