On 12/6/2022 3:49 PM, Andrzej Hajda wrote:
CSB FIFOs store 64-bit Context Status Buffers used by the GuC firmware. They
are accessed through a 32-bit register. Reads must occur in pairs to obtain
a single 64-bit CSB entry. The second read pops the CSB entry off the FIFO.
If a GuC reset happens between the two reads, the FIFO must be read once
more to restore proper behaviour.
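To illustrate the paired-read behaviour described above, here is a minimal user-space model in C. It is only a sketch under stated assumptions: `struct csb_fifo`, `csb_fifo_read32()` and `csb_fifo_recover()` are hypothetical names, the low-dword-first ordering is assumed, and `half_read` stands in for the per-engine read flag. None of it is i915 or GuC code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CSB_FIFO_DEPTH 8

/* Hypothetical model of one CSB FIFO; not a real hardware or driver type. */
struct csb_fifo {
	uint64_t entries[CSB_FIFO_DEPTH];
	unsigned int head;	/* index of the oldest entry */
	unsigned int count;	/* number of queued entries */
	bool half_read;		/* set after the first read of a pair */
};

static void csb_fifo_push(struct csb_fifo *f, uint64_t csb)
{
	assert(f->count < CSB_FIFO_DEPTH);
	f->entries[(f->head + f->count) % CSB_FIFO_DEPTH] = csb;
	f->count++;
}

/*
 * One 32-bit register read. The first read of a pair returns the low
 * dword (assumed ordering); the second returns the high dword and pops
 * the entry off the FIFO.
 */
static uint32_t csb_fifo_read32(struct csb_fifo *f)
{
	uint64_t csb = f->count ? f->entries[f->head] : 0;

	if (!f->half_read) {
		f->half_read = true;
		return (uint32_t)csb;
	}

	f->half_read = false;
	if (f->count) {
		f->head = (f->head + 1) % CSB_FIFO_DEPTH;
		f->count--;
	}
	return (uint32_t)(csb >> 32);
}

/* Normal consumer path: two reads yield one 64-bit CSB entry. */
static uint64_t csb_fifo_read64(struct csb_fifo *f)
{
	uint64_t lo = csb_fifo_read32(f);
	uint64_t hi = csb_fifo_read32(f);

	return lo | (hi << 32);
}

/* Recovery after an interrupted pair: one extra read realigns the FIFO. */
static void csb_fifo_recover(struct csb_fifo *f)
{
	if (f->half_read)
		(void)csb_fifo_read32(f);
}
```

In this model, pushing two entries, issuing a single `csb_fifo_read32()` and then calling `csb_fifo_recover()` discards the half-read entry, so the next `csb_fifo_read64()` returns the second entry intact rather than a mix of two entries.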
From the description, this seems to be a bug in the GuC firmware. The
firmware is supposed to make sure all stale CSB entries are discarded
when it gets reloaded, so it looks like the issue here is that that code
is not correctly handling the case where there is an odd number of
stale dwords. All the registers you're reading in this patch are GuC
registers, so it should be possible to implement this fix within the
firmware.
That said, since we do need to keep support for current/older GuC
versions, we'll still need to merge this WA and then disable it when we
detect that we're loading a GuC version that includes the fix. However,
I'd prefer it if we could first get confirmation that there is indeed a
bug in the stale CSB handling inside of GuC (I believe Antonio is
already looking at that) and that this is the best way to WA the issue,
because normally we try to avoid touching internal GuC regs from i915
unless there are no alternatives.
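The version gating mentioned above could look roughly like the sketch below. This is a standalone illustration, not i915 code: `struct fw_version`, `fw_version_pack()` and `csb_wa_needed()` are hypothetical names, and the first fixed GuC version is not known yet, so the caller has to supply it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical firmware version triple. */
struct fw_version {
	uint32_t major;
	uint32_t minor;
	uint32_t patch;
};

/*
 * Pack a version into one integer so that ordinary integer comparison
 * orders versions correctly. Minor and patch must each fit in 8 bits.
 */
static uint32_t fw_version_pack(struct fw_version v)
{
	assert(v.minor < 256 && v.patch < 256);
	return (v.major << 16) | (v.minor << 8) | v.patch;
}

/*
 * The workaround is needed only for firmware older than the first
 * release that contains the fix. first_fixed is a placeholder supplied
 * by the caller; the real value is unknown at this point.
 */
static bool csb_wa_needed(struct fw_version loaded,
			  struct fw_version first_fixed)
{
	return fw_version_pack(loaded) < fw_version_pack(first_fixed);
}
```

With a check of this shape, the FIFO recovery would be skipped whenever `csb_wa_needed()` returns false for the loaded firmware.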
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/7351
Signed-off-by: Andrzej Hajda <andrzej.hajda@xxxxxxxxx>
---
drivers/gpu/drm/i915/gt/intel_reset.c | 25 ++++++++++++++++++++++
drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h | 13 +++++++++++
2 files changed, 38 insertions(+)
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index 24736ebee17c28..8e64b9024e3258 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -721,6 +721,30 @@ bool intel_has_reset_engine(const struct intel_gt *gt)
return INTEL_INFO(gt->i915)->has_reset_engine;
}
+static void recover_csb_fifos(struct intel_gt *gt)
+{
+ const struct {
+ u32 bit;
+ i915_reg_t csb;
+ } csb_map[] = {
+ { .bit = GUC_CSB_READ_FLAG_RCS, .csb = GUC_CS_CSB },
+ { .bit = GUC_CSB_READ_FLAG_VCS, .csb = GUC_VCS_CSB },
+ { .bit = GUC_CSB_READ_FLAG_VECS, .csb = GUC_VECS_CSB },
+ { .bit = GUC_CSB_READ_FLAG_BCS, .csb = GUC_BCS_CSB },
+ { .bit = GUC_CSB_READ_FLAG_CCS, .csb = GUC_CCS_CSB },
For MTL we'd also need the GSC_CSB, but hopefully we can get the updated
GuC before we remove the force_probe and therefore not have to support
this WA on MTL.
+ };
+ u32 dbg;
+
+ if (!intel_uc_uses_guc_submission(&gt->uc))
+ return;
The GuC still gets the CSB interrupts even if we're not using GuC
submission, although in that case it just pops the CSB entries out of
the FIFO without looking at them. Not sure if we still need the WA in
that case (again need input from the GuC side).
Daniele
+
+ dbg = intel_uncore_read(gt->uncore, GUCINT_DEBUG2);
+ for (int i = 0; i < ARRAY_SIZE(csb_map); ++i) {
+ if (dbg & csb_map[i].bit)
+ intel_uncore_read(gt->uncore, csb_map[i].csb);
+ }
+}
+
int intel_reset_guc(struct intel_gt *gt)
{
u32 guc_domain =
@@ -731,6 +755,7 @@ int intel_reset_guc(struct intel_gt *gt)
intel_uncore_forcewake_get(gt->uncore, FORCEWAKE_ALL);
ret = gen6_hw_domain_reset(gt, guc_domain);
+ recover_csb_fifos(gt);
intel_uncore_forcewake_put(gt->uncore, FORCEWAKE_ALL);
return ret;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h
index 9915de32e894e1..beeb7fbff99453 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h
@@ -154,4 +154,17 @@ struct guc_doorbell_info {
#define GUC_INTR_SW_INT_1 BIT(1)
#define GUC_INTR_SW_INT_0 BIT(0)
+#define GUCINT_DEBUG2 _MMIO(0xC5A4)
+#define GUC_CSB_READ_FLAG_CCS BIT(16)
+#define GUC_CSB_READ_FLAG_BCS BIT(3)
+#define GUC_CSB_READ_FLAG_VECS BIT(2)
+#define GUC_CSB_READ_FLAG_VCS BIT(1)
+#define GUC_CSB_READ_FLAG_RCS BIT(0)
+
+#define GUC_CS_CSB _MMIO(0xC5B0)
+#define GUC_BCS_CSB _MMIO(0xC5B4)
+#define GUC_VCS_CSB _MMIO(0xC5B8)
+#define GUC_VECS_CSB _MMIO(0xC5BC)
+#define GUC_CCS_CSB _MMIO(0xC5E0)
+
#endif