Hi Ville, +Nirmoy > -----Original Message----- > From: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> > Sent: Wednesday, December 13, 2023 1:30 AM > To: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx> > Cc: intel-gfx@xxxxxxxxxxxxxxxxxxxxx; Sripada, Radhakrishna > <radhakrishna.sripada@xxxxxxxxx> > Subject: Re: [PATCH 04/12] drm/i915: Bypass LMEMBAR/GTTMMADR for MTL > stolen memory access > > On Wed, Dec 13, 2023 at 11:09:38AM +0200, Joonas Lahtinen wrote: > > Quoting Ville Syrjala (2023-12-13 02:42:29) > > > From: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> > > > > > > On MTL accessing stolen memory via the BARs is somehow borked, > > > and it can hang the machine. As a workaround let's bypass the > > > BARs and just go straight to DSMBASE/GSMBASE instead. > > > > > > Note that on every other platform this itself would hang the > > > machine, but on MTL the system firmware is expected to relax > > > the access permission guarding stolen memory to enable this > > > workaround, and thus direct CPU accesses should be fine. > > > > Shouldn't this have a proper workaround number assigned? > > I think there are various numbers, half of which I couldn't even > find in any database, and the other half I couldn't access for > whatever reason. So dunno what situation really is apart from > the firmware clearly implemening its part (since I can poke > DSM/GSM directly without killing the machine). > > RK do you know what we should call this? Nirmoy previously used Wa_22018444074 in [1]. There were some associated issues Wa_13010847436 and Wa_14019519902 which Nirmoy quoted in [2]. Wa_22018529850 is derived from Wa_22018444074, is targeting the latest Gop driver change which is installed in bat-mtlp-8 hopefully it should help debug the issue further. Regarding the patch itself, Do we need to carve out the range from e820 the area around DSM if we can directly access the range from CPU without the bar? Thanks, Radhakrishna(RK) Sripada 1. https://patchwork.freedesktop.org/series/120683/ 2. https://patchwork.freedesktop.org/series/123329/ > > > > > Regards, Joonas > > > > > > > > Signed-off-by: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> > > > --- > > > drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 11 ++++++++++- > > > drivers/gpu/drm/i915/gt/intel_ggtt.c | 13 ++++++++++++- > > > 2 files changed, 22 insertions(+), 2 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c > b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c > > > index ee237043c302..252fe5cd6ede 100644 > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c > > > @@ -941,7 +941,16 @@ i915_gem_stolen_lmem_setup(struct > drm_i915_private *i915, u16 type, > > > dsm_size = ALIGN_DOWN(lmem_size - dsm_base, SZ_1M); > > > } > > > > > > - if (pci_resource_len(pdev, GEN12_LMEM_BAR) < lmem_size) { > > > + if (IS_METEORLAKE(i915)) { > > > + /* > > > + * Workaround: access via BAR can hang MTL, go directly to DSM. > > > + * > > > + * Normally this would not work but on MTL the system firmware > > > + * should have relaxed the access permissions sufficiently. > > > + */ > > > + io_start = intel_uncore_read64(uncore, GEN12_DSMBASE) & > GEN12_BDSM_MASK; > > > + io_size = dsm_size; > > > + } else if (pci_resource_len(pdev, GEN12_LMEM_BAR) < lmem_size) { > > > io_start = 0; > > > io_size = 0; > > > } else { > > > diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c > b/drivers/gpu/drm/i915/gt/intel_ggtt.c > > > index 21a7e3191c18..ab71d74ec426 100644 > > > --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c > > > +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c > > > @@ -24,6 +24,7 @@ > > > #include "intel_ring.h" > > > #include "i915_drv.h" > > > #include "i915_pci.h" > > > +#include "i915_reg.h" > > > #include "i915_request.h" > > > #include "i915_scatterlist.h" > > > #include "i915_utils.h" > > > @@ -1152,13 +1153,23 @@ static unsigned int gen6_gttadr_offset(struct > drm_i915_private *i915) > > > static int ggtt_probe_common(struct i915_ggtt *ggtt, u64 size) > > > { > > > struct drm_i915_private *i915 = ggtt->vm.i915; > > > + struct intel_uncore *uncore = ggtt->vm.gt->uncore; > > > struct pci_dev *pdev = to_pci_dev(i915->drm.dev); > > > phys_addr_t phys_addr; > > > u32 pte_flags; > > > int ret; > > > > > > GEM_WARN_ON(pci_resource_len(pdev, GEN4_GTTMMADR_BAR) != > gen6_gttmmadr_size(i915)); > > > - phys_addr = pci_resource_start(pdev, GEN4_GTTMMADR_BAR) + > gen6_gttadr_offset(i915); > > > + /* > > > + * Workaround: access via BAR can hang MTL, go directly to GSM. > > > + * > > > + * Normally this would not work but on MTL the system firmware > > > + * should have relaxed the access permissions sufficiently. > > > + */ > > > + if (IS_METEORLAKE(i915)) > > > + phys_addr = intel_uncore_read64(uncore, GEN12_GSMBASE) & > GEN12_BDSM_MASK; > > > + else > > > + phys_addr = pci_resource_start(pdev, GEN4_GTTMMADR_BAR) + > gen6_gttadr_offset(i915); > > > > > > if (needs_wc_ggtt_mapping(i915)) > > > ggtt->gsm = ioremap_wc(phys_addr, size); > > > -- > > > 2.41.0 > > > > > -- > Ville Syrjälä > Intel