Hi Andrzej,

The patch mentioned below does not help with the issue.

Thanks,
RK

> -----Original Message-----
> From: Hajda, Andrzej <andrzej.hajda@xxxxxxxxx>
> Sent: Friday, November 3, 2023 2:18 PM
> To: Sripada, Radhakrishna <radhakrishna.sripada@xxxxxxxxx>; Tvrtko Ursulin
> <tvrtko.ursulin@xxxxxxxxxxxxxxx>; intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> Cc: Chris Wilson <chris.p.wilson@xxxxxxxxxxxxxxx>; Vivi, Rodrigo
> <rodrigo.vivi@xxxxxxxxx>
> Subject: Re: [PATCH] drm/i915/mtl: Increase guard pages when vt-d is
> enabled
>
> On 03.11.2023 16:53, Sripada, Radhakrishna wrote:
> > Hi Tvrtko,
> >
> >> -----Original Message-----
> >> From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx>
> >> Sent: Friday, November 3, 2023 1:30 AM
> >> To: Sripada, Radhakrishna <radhakrishna.sripada@xxxxxxxxx>; Hajda,
> >> Andrzej <andrzej.hajda@xxxxxxxxx>; intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> >> Cc: Chris Wilson <chris.p.wilson@xxxxxxxxxxxxxxx>
> >> Subject: Re: [PATCH] drm/i915/mtl: Increase guard pages when vt-d is
> >> enabled
> >>
> >> On 02/11/2023 22:14, Sripada, Radhakrishna wrote:
> >>> Hi Tvrtko,
> >>>
> >>>> -----Original Message-----
> >>>> From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx>
> >>>> Sent: Thursday, November 2, 2023 10:41 AM
> >>>> To: Hajda, Andrzej <andrzej.hajda@xxxxxxxxx>; Sripada, Radhakrishna
> >>>> <radhakrishna.sripada@xxxxxxxxx>; intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> >>>> Cc: Chris Wilson <chris.p.wilson@xxxxxxxxxxxxxxx>
> >>>> Subject: Re: [PATCH] drm/i915/mtl: Increase guard pages when vt-d is
> >>>> enabled
> >>>>
> >>>> On 02/11/2023 16:58, Andrzej Hajda wrote:
> >>>>> On 02.11.2023 17:06, Radhakrishna Sripada wrote:
> >>>>>> Experiments were conducted with different multipliers for the
> >>>>>> VTD_GUARD macro; even with a multiplier of 185 we were still
> >>>>>> observing occasional pipe faults when running
> >>>>>> kms_cursor_legacy --run-subtest single-bo.
> >>>>>>
> >>>>>> There could be an underlying issue, which is being investigated;
> >>>>>> for now, bump the guard pages for MTL.
> >>>>>>
> >>>>>> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2017
> >>>>>> Cc: Gustavo Sousa <gustavo.sousa@xxxxxxxxx>
> >>>>>> Cc: Chris Wilson <chris.p.wilson@xxxxxxxxxxxxxxx>
> >>>>>> Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@xxxxxxxxx>
> >>>>>> ---
> >>>>>>  drivers/gpu/drm/i915/gem/i915_gem_domain.c | 3 +++
> >>>>>>  1 file changed, 3 insertions(+)
> >>>>>>
> >>>>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> >>>>>> index 3770828f2eaf..b65f84c6bb3f 100644
> >>>>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> >>>>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> >>>>>> @@ -456,6 +456,9 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> >>>>>>      if (intel_scanout_needs_vtd_wa(i915)) {
> >>>>>>          unsigned int guard = VTD_GUARD;
> >>>>>> +        if (IS_METEORLAKE(i915))
> >>>>>> +            guard *= 200;
> >>>>>> +
> >>>>> 200 * VTD_GUARD = 200 * 168 * 4K = 131MB
> >>>>>
> >>>>> Looks insanely high: 131MB for padding, and since it is applied
> >>>>> before and after, that becomes 262MB of wasted address space per
> >>>>> plane. Just signalling; I do not know if this actually hurts.
> >>>> Yeah, this feels crazy. There must be some other explanation which
> >>>> is getting hidden by the crazy amount of padding, so I'd rather we
> >>>> figured it out.
> >>>>
> >>>> With 262MiB per fb, how many fit in the GGTT before eviction hits?
> >>>> N screens with double/triple buffering?
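As a back-of-the-envelope check of the numbers quoted above, the
arithmetic works out as below. This is a minimal standalone sketch,
assuming the upstream VTD_GUARD of 168 * 4K pages and a 4 GiB GGTT
address space; it is not driver code.

#include <stdio.h>

int main(void)
{
    const unsigned long long page = 4096ULL;
    const unsigned long long vtd_guard = 168 * page;      /* ~672 KiB */
    const unsigned long long mtl_guard = 200 * vtd_guard; /* ~131 MiB */
    const unsigned long long per_fb = 2 * mtl_guard;      /* guard before + after */
    const unsigned long long ggtt = 4ULL << 30;           /* 4 GiB address space */

    printf("padding per side: %llu MiB\n", mtl_guard >> 20); /* 131 */
    printf("padding per fb:   %llu MiB\n", per_fb >> 20);    /* 262 */
    /* The padding alone caps the fb count, before counting the fbs
     * themselves: */
    printf("max fbs by padding alone: %llu\n", ggtt / per_fb); /* 15 */
    return 0;
}

So the padding by itself limits a 4 GiB GGTT to roughly 15 pinned
framebuffers, which is what makes the multi-screen double/triple
buffering question above pressing.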
> >>> I believe with this method we will have to limit the number of
> >>> frame buffers in the system. One alternative that worked is to do a
> >>> proper clear range for the ggtt instead of doing a nop. Although it
> >>> adds marginal time during suspend/resume/boot, it does not add
> >>> restrictions on the number of fb's that can be used.
> >>
> >> And if we remember, the guard pages replaced clearing to scratch
> >> precisely to improve suspend/resume times, i.e. the user
> >> experience. :(
> >>
> >> Luckily there is time to fix this properly on MTL one way or the
> >> other. Is it just kms_cursor_legacy --run-subtest single-bo that is
> >> affected?
> > I am trying to dump the page table entries at the time of failure for
> > both the frame buffer and, if required, for the guard pages. Will see
> > if I get some info from there.
> >
> > Yes, the test kms_cursor_legacy is used to reliably reproduce. Looking
> > at public CI, I also see pipe errors being reported with varying
> > occurrences while running kms_cursor_crc, kms_pipe_crc_basic, and
> > kms_plane_scaling. More details on the occurrences can be found
> > here [1].
> >
> > Thanks,
> > RK
> >
> > 1. http://gfx-ci.igk.intel.com/cibuglog-ng/results/knownfailures?query_key=d9c3297dd17dda35a6c638eb96b3139bd1a6633c
>
> Could you check if [1] helps?
>
> [1]: https://patchwork.freedesktop.org/series/125926/
>
> Regards
> Andrzej
>
> >> Regards,
> >>
> >> Tvrtko
> >>
> >>
> >>>> Regards,
> >>>>
> >>>> Tvrtko
> >>>>
> >>>> P.S. Where did the 185 from the commit message come from?
> >>> 185 came from an experiment to increase the guard size. It is not a
> >>> standard number.
> >>>
> >>> Regards,
> >>> Radhakrishna(RK) Sripada
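For reference, the alternative RK describes above (a proper clear range
for the GGTT instead of a nop) would look roughly like the sketch below.
The names nop_clear_range, gen8_ggtt_clear_range, and
intel_scanout_needs_vtd_wa follow the upstream GGTT setup in
drivers/gpu/drm/i915/gt/intel_ggtt.c, but ggtt_setup_clear_range is a
hypothetical helper used here purely for illustration; it is not the
actual patch under discussion.

/* Sketch only: re-enable scratch-page clearing of evicted GGTT ranges
 * on MTL instead of relying on oversized guard pages. */
static void ggtt_setup_clear_range(struct i915_ggtt *ggtt)
{
    struct drm_i915_private *i915 = ggtt->vm.i915;

    /* Fast path: leave stale PTEs in place on unbind. */
    ggtt->vm.clear_range = nop_clear_range;

    /*
     * With VT-d active (and, per this thread, possibly on MTL in
     * general), point evicted ranges back at the scratch page so a
     * scanout prefetch can never walk into an unmapped PTE. The
     * trade-off is extra PTE writes at unbind and suspend/resume
     * time, which is exactly what the guard pages were meant to
     * avoid.
     */
    if (intel_scanout_needs_vtd_wa(i915) || IS_METEORLAKE(i915))
        ggtt->vm.clear_range = gen8_ggtt_clear_range;
}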