Re: [PATCH] drm/i915/mtl: Increase guard pages when vt-d is enabled

Hi Tvrtko,

> -----Original Message-----
> From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx>
> Sent: Friday, November 3, 2023 1:30 AM
> To: Sripada, Radhakrishna <radhakrishna.sripada@xxxxxxxxx>; Hajda, Andrzej
> <andrzej.hajda@xxxxxxxxx>; intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> Cc: Chris Wilson <chris.p.wilson@xxxxxxxxxxxxxxx>
> Subject: Re:  [PATCH] drm/i915/mtl: Increase guard pages when vt-d is
> enabled
> 
> 
> On 02/11/2023 22:14, Sripada, Radhakrishna wrote:
> > Hi Tvrtko,
> >
> >> -----Original Message-----
> >> From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx>
> >> Sent: Thursday, November 2, 2023 10:41 AM
> >> To: Hajda, Andrzej <andrzej.hajda@xxxxxxxxx>; Sripada, Radhakrishna
> >> <radhakrishna.sripada@xxxxxxxxx>; intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> >> Cc: Chris Wilson <chris.p.wilson@xxxxxxxxxxxxxxx>
> >> Subject: Re:  [PATCH] drm/i915/mtl: Increase guard pages when vt-d is enabled
> >>
> >>
> >> On 02/11/2023 16:58, Andrzej Hajda wrote:
> >>> On 02.11.2023 17:06, Radhakrishna Sripada wrote:
> >>>> Experiments were conducted with different multipliers to VTD_GUARD macro
> >>>> with multiplier of 185 we were observing occasional pipe faults when
> >>>> running kms_cursor_legacy --run-subtest single-bo
> >>>>
> >>>> There could possibly be an underlying issue that is being
> >>>> investigated, for
> >>>> now bump the guard pages for MTL.
> >>>>
> >>>> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2017
> >>>> Cc: Gustavo Sousa <gustavo.sousa@xxxxxxxxx>
> >>>> Cc: Chris Wilson <chris.p.wilson@xxxxxxxxxxxxxxx>
> >>>> Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@xxxxxxxxx>
> >>>> ---
> >>>>    drivers/gpu/drm/i915/gem/i915_gem_domain.c | 3 +++
> >>>>    1 file changed, 3 insertions(+)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> >>>> b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> >>>> index 3770828f2eaf..b65f84c6bb3f 100644
> >>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> >>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> >>>> @@ -456,6 +456,9 @@ i915_gem_object_pin_to_display_plane(struct
> >>>> drm_i915_gem_object *obj,
> >>>>        if (intel_scanout_needs_vtd_wa(i915)) {
> >>>>            unsigned int guard = VTD_GUARD;
> >>>> +        if (IS_METEORLAKE(i915))
> >>>> +            guard *= 200;
> >>>> +
> >>>
> >>> 200 * VTD_GUARD = 200 * 168 * 4K = 131MB
> >>>
> >>> Looks insanely high, 131MB for padding, if this is before and after it
> >>> becomes even 262MB of wasted address per plane. Just signalling, I do
> >>> not know if this actually hurts.
> >>
> >> Yeah this feels crazy. There must be some other explanation which is
> >> getting hidden by the crazy amount of padding so I'd rather we figured
> >> it out.
> >>
> >> With 262MiB per fb how many fit in GGTT before eviction hits? N screens
> >> with double/triple buffering?
> >
> > I believe with this method we will have to limit the no of frame buffers in the system. One alternative
> > that worked is to do a proper clear range for the ggtt instead of doing a nop. Although it adds marginal
> > time during suspend/resume/boot it does not add restrictions to the no of fb's that can be used.
> 
> And if we remember the guard pages replaced clearing to scratch, to
> improve suspend resume times, exactly for improving user experience. :(
> 
> Luckily there is time to fix this properly on MTL one way or the other.
> Is it just kms_cursor_legacy --run-subtest single-bo that is affected?

I am trying to dump the page table entries at the time of failure for both the frame buffer and, if required,
for the guard pages. I will see if I get some info from there.

Yes, kms_cursor_legacy is the test used to reproduce this reliably. Looking at public CI, I also see pipe errors
reported, with varying frequency, while running kms_cursor_crc, kms_pipe_crc_basic,
and kms_plane_scaling. More details on the occurrences can be found here [1].

Thanks,
RK

1. http://gfx-ci.igk.intel.com/cibuglog-ng/results/knownfailures?query_key=d9c3297dd17dda35a6c638eb96b3139bd1a6633c

> 
> Regards,
> 
> Tvrtko
> 
> 
> >>
> >> Regards,
> >>
> >> Tvrtko
> >>
> >> P.S. Where did the 185 from the commit message come from?
> > 185 came from experiment to increase the guard size. It is not a standard number.
> >
> > Regards,
> > Radhakrishna(RK) Sripada
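
For reference, the guard-size arithmetic discussed in this thread can be sketched as below. This is only an illustration, not driver code: the helper names (guard_bytes, padding_per_plane, max_planes) are hypothetical, the 168-page and 4K figures are taken from the review comment above, and the "before and after" doubling is the assumption raised in that comment rather than a confirmed property of the pinning path.

```c
#include <assert.h>
#include <stdint.h>

/* Figures quoted in the thread: VTD_GUARD covers 168 GTT pages of 4 KiB. */
#define GTT_PAGE_SIZE   4096ull
#define VTD_GUARD_PAGES 168ull

/* Guard size in bytes on one side of a framebuffer for a given multiplier. */
static uint64_t guard_bytes(uint64_t multiplier)
{
	return multiplier * VTD_GUARD_PAGES * GTT_PAGE_SIZE;
}

/* Padding per pinned plane, ASSUMING the guard sits both before and after. */
static uint64_t padding_per_plane(uint64_t multiplier)
{
	return 2 * guard_bytes(multiplier);
}

/*
 * Rough upper bound on how many framebuffers of fb_bytes each fit in a
 * GGTT of ggtt_bytes once padding is added. Ignores fragmentation and
 * every other GGTT user, so the real limit would be lower.
 */
static uint64_t max_planes(uint64_t ggtt_bytes, uint64_t multiplier,
			   uint64_t fb_bytes)
{
	uint64_t per_plane = padding_per_plane(multiplier) + fb_bytes;

	assert(per_plane > 0); /* avoid dividing by zero */
	return ggtt_bytes / per_plane;
}
```

With a multiplier of 200, guard_bytes(200) is 137,625,600 bytes (131.25 MiB) and padding_per_plane(200) is 262.5 MiB. For a hypothetical 4 GiB GGTT (the real size depends on the platform), max_planes(4ull << 30, 200, 0) gives 15 even before counting the framebuffers themselves.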



