On 20/09/2021 08:38, Jani Nikula wrote:
On Mon, 20 Sep 2021, Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> wrote:
On 18/09/2021 00:38, Matthew Brost wrote:
From: Hugh Dickins <hughd@xxxxxxxxxx>
5.15-rc1 crashes with blank screen when booting up on two ThinkPads
using i915. Bisections converge convincingly, but arrive at different
and surprising "culprits", none of them the actual culprit.
It is certainly surprising this patch crashed SNB and KBL.
How feasible would it be to make this code just not run when GuC is not
used? Given the field it adds is called ce->guc_blocked it sounds like a
natural and preferable thing to do... if possible.
netconsole (with init_netconsole() hacked to call i915_init() when
logging has started, instead of by module_init()) tells the story:
kernel BUG at drivers/gpu/drm/i915/i915_sw_fence.c:245!
with RSI: ffffffff814d408b pointing to sw_fence_dummy_notify().
I've been building with CONFIG_CC_OPTIMIZE_FOR_SIZE=y, and that
function needs to be 4-byte aligned.
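For readers following along, the alignment requirement can be sketched in userspace C. This is only an illustration, assuming the mask clears the two low bits of the stored callback address so they can be used as flags; every `sketch_*` name is hypothetical and none of this is the driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for I915_SW_FENCE_MASK: keep all but the two low bits. */
#define SKETCH_FENCE_MASK (~(uintptr_t)3)

typedef int (*sketch_notify_fn)(int state);

/* Force 4-byte alignment so the two low address bits are guaranteed zero. */
__attribute__((aligned(4)))
static int sketch_dummy_notify(int state)
{
	return state;
}

/* Store the callback address and two flag bits in a single word. */
static uintptr_t sketch_pack(sketch_notify_fn fn, unsigned int flag_bits)
{
	uintptr_t addr = (uintptr_t)fn;

	/* Same condition as the check in the patch: non-NULL, low bits clear. */
	assert(fn && !(addr & ~SKETCH_FENCE_MASK));
	return addr | ((uintptr_t)flag_bits & ~SKETCH_FENCE_MASK);
}

/* Recover the callback by masking the flag bits off again. */
static sketch_notify_fn sketch_unpack_fn(uintptr_t flags)
{
	return (sketch_notify_fn)(flags & SKETCH_FENCE_MASK);
}
```

If the compiler places the function on an address that is not a multiple of 4, which size optimization may allow, masking would yield a different address than the one stored, hence the explicit alignment attribute.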
v2:
(Jani Nikula)
- Change BUG_ON to WARN_ON
However in this case the code would then go on and call into a wrong
function offset which may be worse than a BUG_ON, no?
So how about just
	if (WARN_ON(...))
		return;
or whatever is needed to give both the user and the CI a better
opportunity to see the error.
Sounds good to me.
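A rough userspace sketch of the warn-and-return idea, for illustration only. `SKETCH_WARN_ON` is a hypothetical stand-in that, like the kernel's `WARN_ON()`, reports the failed condition and evaluates to its truth value; the struct, mask, and function names are likewise assumptions, not the i915 code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical mask: the two low bits of the stored address are flag bits. */
#define SKETCH_MASK (~(uintptr_t)3)

typedef int (*sketch_fn)(int state);

struct sketch_fence {
	uintptr_t flags;
};

/* Report the failed condition on stderr and hand back its truth value. */
static int sketch_warn_on(int cond, const char *expr)
{
	if (cond)
		fprintf(stderr, "WARNING: %s\n", expr);
	return cond;
}
#define SKETCH_WARN_ON(expr) sketch_warn_on(!!(expr), #expr)

/* A 4-byte aligned callback that passes the check. */
__attribute__((aligned(4)))
static int sketch_aligned_notify(int state)
{
	return state;
}

static void sketch_fence_init(struct sketch_fence *fence, sketch_fn fn)
{
	assert(fence != NULL);

	/* Warn and bail out instead of storing a callback that cannot be recovered. */
	if (SKETCH_WARN_ON(!fn || ((uintptr_t)fn & ~SKETCH_MASK)))
		return;

	fence->flags = (uintptr_t)fn;
}
```

With this shape a bad callback leaves the fence untouched and produces a visible warning, rather than halting the machine or later jumping to a masked, wrong address.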
Regards,
Tvrtko
BR,
Jani
Fixes: 62eaf0ae217d ("drm/i915/guc: Support request cancellation")
Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
Signed-off-by: Matthew Brost <matthew.brost@xxxxxxxxx>
Reviewed-by: Matthew Brost <matthew.brost@xxxxxxxxx>
---
drivers/gpu/drm/i915/gt/intel_context.c | 1 +
drivers/gpu/drm/i915/i915_sw_fence.c | 4 +++-
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index ff637147b1a9..f02c2202da9d 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -362,6 +362,7 @@ static int __intel_context_active(struct i915_active *active)
 	return 0;
 }
+__aligned(4) /* Respect the I915_SW_FENCE_MASK */
Hugh suggested __i915_sw_fence_call which I think would be the right
thing to do.
Regards,
Tvrtko
 static int sw_fence_dummy_notify(struct i915_sw_fence *sf,
 				 enum i915_sw_fence_notify state)
 {
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c
index c589a681da77..1217b124c1d0 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence.c
@@ -14,8 +14,10 @@
#if IS_ENABLED(CONFIG_DRM_I915_DEBUG)
#define I915_SW_FENCE_BUG_ON(expr) BUG_ON(expr)
+#define I915_SW_FENCE_WARN_ON(expr) WARN_ON(expr)
#else
#define I915_SW_FENCE_BUG_ON(expr) BUILD_BUG_ON_INVALID(expr)
+#define I915_SW_FENCE_WARN_ON(expr) BUILD_BUG_ON_INVALID(expr)
#endif
static DEFINE_SPINLOCK(i915_sw_fence_lock);
@@ -242,7 +244,7 @@ void __i915_sw_fence_init(struct i915_sw_fence *fence,
 			  const char *name,
 			  struct lock_class_key *key)
 {
-	BUG_ON(!fn || (unsigned long)fn & ~I915_SW_FENCE_MASK);
+	I915_SW_FENCE_WARN_ON(!fn || (unsigned long)fn & ~I915_SW_FENCE_MASK);
 	__init_waitqueue_head(&fence->wait, name, key);
 	fence->flags = (unsigned long)fn;