> -----Original Message-----
> From: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> Sent: Saturday, January 09, 2021 7:49 AM
> To: intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>; Mika Kuoppala
> <mika.kuoppala@xxxxxxxxxxxxxxx>; Kumar Valsan, Prathap
> <prathap.kumar.valsan@xxxxxxxxx>; Abodunrin, Akeem G
> <akeem.g.abodunrin@xxxxxxxxx>; Bloomfield, Jon
> <jon.bloomfield@xxxxxxxxx>; Vivi, Rodrigo <rodrigo.vivi@xxxxxxxxx>; Randy
> Wright <rwright@xxxxxxx>; stable@xxxxxxxxxxxxxxx
> Subject: [PATCH 1/3] drm/i915/gt: Limit VFE threads based on GT
>
> MEDIA_STATE_VFE only accepts the 'maximum number of threads' in the
> range [0, n-1] where n is #EU * (#threads/EU) with the number of threads
> based on platform and the number of EU based on the number of slices and
> subslices. This is a fixed number per platform/gt, so appropriately limit the
> number of threads we spawn to match the device.
>
> v2: Oversaturate the system with tasks to force execution on every HW
> thread; if the thread idles it is returned to the pool and may be reused again
> before an unused thread.
>
> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2024
> Fixes: 47f8253d2b89 ("drm/i915/gen7: Clear all EU/L3 residual contexts")
> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx>
> Cc: Prathap Kumar Valsan <prathap.kumar.valsan@xxxxxxxxx>
> Cc: Akeem G Abodunrin <akeem.g.abodunrin@xxxxxxxxx>
> Cc: Jon Bloomfield <jon.bloomfield@xxxxxxxxx>
> Cc: Rodrigo Vivi <rodrigo.vivi@xxxxxxxxx>
> Cc: Randy Wright <rwright@xxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx # v5.7+
> ---
>  drivers/gpu/drm/i915/gt/gen7_renderclear.c | 91 ++++++++++++----------
>  1 file changed, 49 insertions(+), 42 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/gen7_renderclear.c
> b/drivers/gpu/drm/i915/gt/gen7_renderclear.c
> index d93d85cd3027..3ea7c9cc0f3d 100644
> --- a/drivers/gpu/drm/i915/gt/gen7_renderclear.c
> +++ b/drivers/gpu/drm/i915/gt/gen7_renderclear.c
> @@ -7,8 +7,6 @@
>  #include "i915_drv.h"
>  #include "intel_gpu_commands.h"
>
> -#define MAX_URB_ENTRIES 64
> -#define STATE_SIZE (4 * 1024)
>  #define GT3_INLINE_DATA_DELAYS 0x1E00
>  #define batch_advance(Y, CS) GEM_BUG_ON((Y)->end != (CS))
>
> @@ -34,38 +32,57 @@ struct batch_chunk { };
>
>  struct batch_vals {
> -	u32 max_primitives;
> -	u32 max_urb_entries;
> -	u32 cmd_size;
> -	u32 state_size;
> +	u32 max_threads;
>  	u32 state_start;
> -	u32 batch_size;
> +	u32 surface_start;
>  	u32 surface_height;
>  	u32 surface_width;
> -	u32 scratch_size;
> -	u32 max_size;
> +	u32 size;
>  };
>
> +static inline int num_primitives(const struct batch_vals *bv) {
> +	/*
> +	 * We need to oversaturate the GPU with work in order to dispatch
> +	 * a shader on every HW thread.
> +	 */
> +	return bv->max_threads + 2;
> +}
> +
>  static void
>  batch_get_defaults(struct drm_i915_private *i915, struct batch_vals *bv) {
>  	if (IS_HASWELL(i915)) {
> -		bv->max_primitives = 280;
> -		bv->max_urb_entries = MAX_URB_ENTRIES;
> +		switch (INTEL_INFO(i915)->gt) {
> +		default:
> +		case 1:
> +			bv->max_threads = 70;
> +			break;
> +		case 2:
> +			bv->max_threads = 140;
> +			break;
> +		case 3:
> +			bv->max_threads = 280;
> +			break;
> +		}
>  		bv->surface_height = 16 * 16;
>  		bv->surface_width = 32 * 2 * 16;
>  	} else {
> -		bv->max_primitives = 128;
> -		bv->max_urb_entries = MAX_URB_ENTRIES / 2;
> +		switch (INTEL_INFO(i915)->gt) {
> +		default:
> +		case 1: /* including vlv */
> +			bv->max_threads = 36;
> +			break;
> +		case 2:
> +			bv->max_threads = 128;
> +			break;
> +		}

Do we really need to hardcode the maximum number of threads per gt/platform? Why not calculate the number of active threads from no_of_slices * 1024? Also, is "64" not the minimum number of threads supported?

Thanks,
~Akeem
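
For reference, the relation from the commit message, n = #EU * (#threads/EU), can be sketched as below. This is only an illustration, not code from the patch: the helper names are hypothetical, and the Haswell GT3 figures used in the usage note (40 EUs, 7 threads per EU) are my assumption, chosen because their product matches the 280 in the table above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of i915.
 * n = #EU * (#threads/EU), as described in the commit message.
 */
static uint32_t hw_thread_count(uint32_t eu_count, uint32_t threads_per_eu)
{
	assert(eu_count > 0 && threads_per_eu > 0);
	return eu_count * threads_per_eu;
}

/*
 * The 'maximum number of threads' field accepts values in [0, n-1],
 * so the largest legal value for n hardware threads is n - 1.
 */
static uint32_t vfe_max_threads_field(uint32_t n)
{
	assert(n > 0);
	return n - 1;
}

/*
 * Mirrors num_primitives() in the quoted patch: dispatch two more
 * primitives than there are threads to oversaturate the GPU.
 */
static uint32_t primitives_to_dispatch(uint32_t n)
{
	return n + 2;
}
```

Under the assumed GT3 figures, hw_thread_count(40, 7) gives 280, the largest field value is 279, and 282 primitives would be dispatched. Whether the EU count can be read reliably at runtime on these parts is exactly the open question above.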