On 8/18/2023 6:29 PM, Rodrigo Vivi wrote:
On Fri, Aug 18, 2023 at 11:32:27AM +0530, Sundaresan, Sujaritha wrote:
On 8/18/2023 11:30 AM, Gupta, Anshuman wrote:
-----Original Message-----
From: Intel-gfx <intel-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of
Sujaritha Sundaresan
Sent: Friday, August 18, 2023 8:16 AM
To: intel-gfx@xxxxxxxxxxxxxxxxxxxxx
Subject: [PATCH] drm/i915: Add intel_pcode_probe
Added intel_pcode_probe, promoted wait for lmem init and intel_pcode_init
prior to mmio_probe during load, so that GT registers can be accessed only
after this, else MCA is observed.
Signed-off-by: Sujaritha Sundaresan <sujaritha.sundaresan@xxxxxxxxx>
Both DG1 and DG2 crashed during i915_pci_probe.
BAT is failing.
Thanks,
Anshuman Gupta.
Hi Anshuman,
Yes I'm currently looking into it.
Thanks,
Suja
---
drivers/gpu/drm/i915/i915_driver.c | 37 ++++++++++++++++++++++++-----
drivers/gpu/drm/i915/intel_uncore.c | 12 ----------
2 files changed, 31 insertions(+), 18 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_driver.c
b/drivers/gpu/drm/i915/i915_driver.c
index f8dbee7a5af7..92cafceaf447 100644
--- a/drivers/gpu/drm/i915/i915_driver.c
+++ b/drivers/gpu/drm/i915/i915_driver.c
@@ -93,6 +93,7 @@
#include "i915_memcpy.h"
#include "i915_perf.h"
#include "i915_query.h"
+#include "i915_reg.h"
#include "i915_suspend.h"
#include "i915_switcheroo.h"
#include "i915_sysfs.h"
@@ -436,6 +437,32 @@ static int i915_pcode_init(struct drm_i915_private
*i915)
return 0;
}
+static int intel_pcode_probe(struct drm_i915_private *i915) {
+ struct intel_uncore *uncore;
+ int ret;
+
+ /*
+ * The boot firmware initializes local memory and assesses its health.
+ * If memory training fails, the punit will have been instructed to
+ * keep the GT powered down; we won't be able to communicate
with it
+ * and we should not continue with driver initialization.
+ */
+ if (IS_DGFX(i915) &&
+ !(__raw_uncore_read32(uncore, GU_CNTL) & LMEM_INIT))
{
+ drm_err(&i915->drm, "LMEM not initialized by firmware\n");
+ return -ENODEV;
+ }
+
+ /*
+ * Driver handshakes with pcode via mailbox command to know that
SoC
+ * initialization is complete before proceeding further
+ */
+ ret = i915_pcode_init(i915);
+
+ return ret;
+}
+
/**
* i915_driver_hw_probe - setup state requiring device access
* @dev_priv: device private
@@ -547,10 +574,6 @@ static int i915_driver_hw_probe(struct
drm_i915_private *dev_priv)
intel_opregion_setup(dev_priv);
- ret = i915_pcode_init(dev_priv);
- if (ret)
- goto err_opregion;
-
/*
* Fill the dram structure to get the system dram info. This will be
* used for memory latency calculation.
@@ -561,8 +584,6 @@ static int i915_driver_hw_probe(struct
drm_i915_private *dev_priv)
return 0;
-err_opregion:
- intel_opregion_cleanup(dev_priv);
err_msi:
if (pdev->msi_enabled)
pci_disable_msi(pdev);
@@ -778,6 +799,10 @@ int i915_driver_probe(struct pci_dev *pdev, const
struct pci_device_id *ent)
if (ret < 0)
goto out_runtime_pm_put;
+ ret = intel_pcode_probe(i915);
+ if (ret)
+ goto out_tiles_cleanup;
+
ret = i915_driver_mmio_probe(i915);
chicken-egg problem here?!
I don't believe this could ever work. You need the MMIO space to be able
to communicate with PCODE mailbox and check the lmem init, no?!
I believe the bug is that PCODE check should come before the LMEM_INIT
check.
LMEM won't be ready before pcode state that everything was ready for
the lmem access. And on your code pcode ready check is still after
the lmem.
Cc: Aravind Iddamsetty <aravind.iddamsetty@xxxxxxxxx>
who was recently raising that we had an order problem there.
Yes looks like there is definitely an ordering issue.
Will look into this more.
if (ret < 0)
goto out_tiles_cleanup;
diff --git a/drivers/gpu/drm/i915/intel_uncore.c
b/drivers/gpu/drm/i915/intel_uncore.c
index dfefad5a5fec..4a353d4adf86 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -2658,18 +2658,6 @@ int intel_uncore_init_mmio(struct intel_uncore
*uncore)
if (ret)
return ret;
- /*
- * The boot firmware initializes local memory and assesses its health.
- * If memory training fails, the punit will have been instructed to
- * keep the GT powered down; we won't be able to communicate
with it
- * and we should not continue with driver initialization.
- */
- if (IS_DGFX(i915) &&
- !(__raw_uncore_read32(uncore, GU_CNTL) & LMEM_INIT)) {
- drm_err(&i915->drm, "LMEM not initialized by firmware\n");
- return -ENODEV;
- }
-
if (GRAPHICS_VER(i915) > 5 && !intel_vgpu_active(i915))
uncore->flags |= UNCORE_HAS_FORCEWAKE;
--
2.41.0