Quoting Rodrigo Vivi (2021-01-07 19:50:37)
> On Fri, Oct 16, 2020 at 06:54:11PM +0100, Chris Wilson wrote:
> > MEDIA_STATE_VFE only accepts the 'maximum number of threads' in the
> > range [0, n-1] where n is #EU * (#threads/EU) with the number of threads
> > based on platform and the number of EU based on the number of slices and
> > subslices. This is a fixed number per platform/gt, so appropriately
> > limit the number of threads we spawn to match the device.
> >
> > Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2024
>
> we need to get this closed...

Unfortunately this failed the validation test. And as that test is still
not in CI, I cannot say why.

My vote would be to remove the clear_residuals until it works on all
target platforms. Plus we clearly need a hsw-gt1 in CI.

> >       bv->scratch_size = bv->surface_height * bv->surface_width;
> > @@ -244,7 +258,6 @@ gen7_emit_vfe_state(struct batch_chunk *batch,
> >                     u32 urb_size, u32 curbe_size,
> >                     u32 mode)
> >  {
> > -     u32 urb_entries = bv->max_urb_entries;
> >       u32 threads = bv->max_primitives - 1;
> >       u32 *cs = batch_alloc_items(batch, 32, 8);
> >
> > @@ -254,7 +267,7 @@ gen7_emit_vfe_state(struct batch_chunk *batch,
> >       *cs++ = 0;
> >
> >       /* number of threads & urb entries for GPGPU vs Media Mode */
> > -     *cs++ = threads << 16 | urb_entries << 8 | mode << 2;
> > +     *cs++ = threads << 16 | 1 << 8 | mode << 2;
>
> why urb_entries = 1 ?

We only used a single entry. There was no measurable benefit from
assigning more entries, and the importance of any side effects from
doing so is unknown.

> the range is 0,64 and 0,128 depending on the sku.
>
> in general there's a min of 32 URBs

Don't forget num_entries * entry_size must fit within the URB
allocation/allotment.
-Chris
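
For illustration only, here is a minimal sketch of the two constraints
discussed in this thread: the thread-count field is limited to n-1 with
n = #EU * (#threads/EU), and num_entries * entry_size must fit within
the URB allocation. The helper names are hypothetical and are not taken
from the patch or from i915. The field shifts mirror the quoted diff,
and both URB sizes are assumed to be expressed in the same unit.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: largest legal value for the 'maximum number of
 * threads' field, i.e. n - 1 where n = #EU * (#threads/EU).
 * Returns 0 if the product is 0, to avoid underflow.
 */
static uint32_t vfe_max_threads_field(uint32_t eu_count,
				      uint32_t threads_per_eu)
{
	uint32_t n = eu_count * threads_per_eu;

	return n ? n - 1 : 0;
}

/*
 * Hypothetical helper: true if num_entries * entry_size fits within the
 * URB allocation. Both sizes are assumed to use the same unit.
 */
static bool urb_config_fits(uint32_t num_entries, uint32_t entry_size,
			    uint32_t urb_alloc)
{
	/* Widen to 64 bits so the multiplication cannot wrap. */
	return (uint64_t)num_entries * entry_size <= urb_alloc;
}

/*
 * Pack the dword using the same shifts as the quoted diff:
 * threads << 16 | urb_entries << 8 | mode << 2.
 */
static uint32_t vfe_pack_dword(uint32_t threads, uint32_t urb_entries,
			       uint32_t mode)
{
	return threads << 16 | urb_entries << 8 | mode << 2;
}
```

With made-up counts of 10 EUs and 7 threads per EU, n is 70, so the
field would be capped at 69. The real counts have to come from the
device's slice/subslice configuration, as the commit message says.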