Woops, this should have gone to the Xe list, not the i915 list.  I'll
re-send.


Matt

On Mon, Aug 12, 2024 at 11:07:37AM -0700, Matt Roper wrote:
> Early in the development of Xe we identified an issue with SVG state
> handling on DG2 and MTL (and later on Xe2 as well).  In
> commit 72ac304769dd ("drm/xe: Emit SVG state on RCS during driver load
> on DG2 and MTL") and commit fb24b858a20d ("drm/xe/xe2: Update SVG state
> handling") we implemented our own workaround to prevent SVG state from
> leaking from context A to context B in cases where context B never
> issues a specific state setting.
>
> The hardware teams have now created official workaround Wa_14019789679
> to cover this issue.  The workaround description only requires emitting
> 3DSTATE_MESH_CONTROL, since they believe that's the only SVG instruction
> that would potentially remain unset by a context B, but still cause
> notable issues if unwanted values were inherited from context A.
> However since we already have a more extensive implementation that emits
> the entire SVG state and prevents _any_ SVG state from unintentionally
> leaking, we'll stick with our existing implementation just to be safe.
>
> Signed-off-by: Matt Roper <matthew.d.roper@xxxxxxxxx>
> ---
>  drivers/gpu/drm/xe/xe_lrc.c        | 35 +++++++++++++++++++++---------
>  drivers/gpu/drm/xe/xe_wa_oob.rules |  2 ++
>  2 files changed, 27 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
> index 58121821f081..974a9cd8c379 100644
> --- a/drivers/gpu/drm/xe/xe_lrc.c
> +++ b/drivers/gpu/drm/xe/xe_lrc.c
> @@ -5,6 +5,8 @@
>
>  #include "xe_lrc.h"
>
> +#include <generated/xe_wa_oob.h>
> +
>  #include <linux/ascii85.h>
>
>  #include "instructions/xe_mi_commands.h"
> @@ -24,6 +26,7 @@
>  #include "xe_memirq.h"
>  #include "xe_sriov.h"
>  #include "xe_vm.h"
> +#include "xe_wa.h"
>
>  #define LRC_VALID				BIT_ULL(0)
>  #define LRC_PRIVILEGE				BIT_ULL(8)
> @@ -1581,19 +1584,31 @@ void xe_lrc_emit_hwe_state_instructions(struct xe_exec_queue *q, struct xe_bb *b
>  	int state_table_size = 0;
>
>  	/*
> -	 * At the moment we only need to emit non-register state for the RCS
> -	 * engine.
> +	 * Wa_14019789679
> +	 *
> +	 * If the driver doesn't explicitly emit the SVG instructions while
> +	 * setting up the default LRC, the context switch will write 0's
> +	 * (noops) into the LRC memory rather than the expected instruction
> +	 * headers.  Application contexts start out as a copy of the default
> +	 * LRC, and if they also do not emit specific settings for some SVG
> +	 * state, then on context restore they'll unintentionally inherit
> +	 * whatever state setting the previous context had programmed into the
> +	 * hardware (i.e., the lack of a 3DSTATE_* instruction in the LRC will
> +	 * prevent the hardware from resetting that state back to any specific
> +	 * value).
> +	 *
> +	 * The official workaround only requires emitting 3DSTATE_MESH_CONTROL
> +	 * since that's a specific state setting that can easily cause GPU
> +	 * hangs if unintentionally inherited.  However to be safe we'll
> +	 * continue to emit all of the SVG state since it's best not to leak
> +	 * any of the state between contexts, even if that leakage is harmless.
>  	 */
> -	if (q->hwe->class != XE_ENGINE_CLASS_RENDER)
> -		return;
> -
> -	switch (GRAPHICS_VERx100(xe)) {
> -	case 1255:
> -	case 1270 ... 2004:
> +	if (XE_WA(gt, 14019789679) && q->hwe->class == XE_ENGINE_CLASS_RENDER) {
>  		state_table = xe_hpg_svg_state;
>  		state_table_size = ARRAY_SIZE(xe_hpg_svg_state);
> -		break;
> -	default:
> +	}
> +
> +	if (!state_table) {
>  		xe_gt_dbg(gt, "No non-register state to emit on graphics ver %d.%02d\n",
>  			  GRAPHICS_VER(xe), GRAPHICS_VERx100(xe) % 100);
>  		return;
> diff --git a/drivers/gpu/drm/xe/xe_wa_oob.rules b/drivers/gpu/drm/xe/xe_wa_oob.rules
> index 5cf27ff27ce6..920ca5060146 100644
> --- a/drivers/gpu/drm/xe/xe_wa_oob.rules
> +++ b/drivers/gpu/drm/xe/xe_wa_oob.rules
> @@ -35,3 +35,5 @@
>  		GRAPHICS_VERSION(2001)
>  22019338487_display	PLATFORM(LUNARLAKE)
>  16023588340	GRAPHICS_VERSION(2001)
> +14019789679	GRAPHICS_VERSION(1255)
> +		GRAPHICS_VERSION_RANGE(1270, 2004)
> --
> 2.45.2
>

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation
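
For readers following along, the gating condition in the patch can be sketched outside the kernel roughly as below. This is only an illustrative model, not driver code: the enum and both helper functions are hypothetical stand-ins, and it assumes the two rule lines added to xe_wa_oob.rules are alternatives (either one matching enables the workaround) with an inclusive version range, which is consistent with the `case 1255:` / `case 1270 ... 2004:` switch the patch removes.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver's engine classes (not the Xe enum). */
enum engine_class {
	ENGINE_CLASS_RENDER,
	ENGINE_CLASS_COPY,
	ENGINE_CLASS_COMPUTE,
};

/*
 * Models the rules added for 14019789679, assuming the two lines are OR'ed:
 *   GRAPHICS_VERSION(1255)
 *   GRAPHICS_VERSION_RANGE(1270, 2004)   (assumed inclusive on both ends)
 * verx100 is the graphics version times 100, e.g. 1255 for 12.55.
 */
static bool wa_14019789679_applies(int verx100)
{
	return verx100 == 1255 || (verx100 >= 1270 && verx100 <= 2004);
}

/*
 * Models the new check: SVG state is emitted only when the workaround
 * applies AND the queue runs on the render engine.
 */
static bool should_emit_svg_state(int verx100, enum engine_class cls)
{
	return wa_14019789679_applies(verx100) && cls == ENGINE_CLASS_RENDER;
}
```

Under these assumptions the set of platforms and engines that get the SVG state table is the same as before the patch; only the mechanism that selects them moves from an open-coded switch to the workaround rules.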