On 2021-10-27 3:58 p.m., Andrey Grodzovsky wrote:
On 2021-10-27 10:50 a.m., Christian König wrote:
Am 27.10.21 um 16:47 schrieb Andrey Grodzovsky:
On 2021-10-27 10:34 a.m., Christian König wrote:
Am 27.10.21 um 16:27 schrieb Andrey Grodzovsky:
[SNIP]
Please let me know if I am still missing some point of yours.
Well, I mean we need to be able to handle this for all drivers.
For sure, but as I said above, in my opinion we need to change this only
for those drivers that don't use the _locked version.
And that absolutely won't work.
See, the dma_fence is a contract between drivers, so you need the same
calling convention across all drivers.
Either we always call the callback with the lock held or we always call
it without the lock; doing it one way sometimes and the other way at
other times won't work.
Christian.
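For context, a simplified sketch of the existing signaling path (based
on drivers/dma-buf/dma-fence.c around this kernel version; tracing and
lockdep annotations trimmed). In both entry points the fence callbacks
run with fence->lock held, which is why signaling a second fence with
the same lockdep class from inside a callback trips lockdep:

int dma_fence_signal_timestamp_locked(struct dma_fence *fence,
				      ktime_t timestamp)
{
	struct dma_fence_cb *cur, *tmp;
	struct list_head cb_list;

	lockdep_assert_held(fence->lock);

	if (unlikely(test_and_set_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
				      &fence->flags)))
		return -EINVAL;

	/* Stash the cb_list before replacing it with the timestamp. */
	list_replace(&fence->cb_list, &cb_list);

	fence->timestamp = timestamp;
	set_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT, &fence->flags);

	list_for_each_entry_safe(cur, tmp, &cb_list, node) {
		INIT_LIST_HEAD(&cur->node);
		cur->func(fence, cur);	/* invoked under fence->lock */
	}

	return 0;
}

int dma_fence_signal(struct dma_fence *fence)
{
	unsigned long flags;
	int ret;

	if (!fence)
		return -EINVAL;

	spin_lock_irqsave(fence->lock, flags);
	ret = dma_fence_signal_timestamp_locked(fence, ktime_get());
	spin_unlock_irqrestore(fence->lock, flags);

	return ret;
}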
I am not sure I fully understand what problems this will cause, but
anyway, then we are back to irq_work. We cannot embed the irq_work as a
union within dma_fence's cb_list because that member is already reused
as the timestamp and as the rcu head after the fence is signaled (see
the trimmed struct below). So I will do it within drm_scheduler with a
single irq_work per drm_sched_entity, as we discussed before.
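For reference, a trimmed sketch of struct dma_fence from
include/linux/dma-fence.h around this kernel version, showing why there
is no spare member on the fence side to overlay an irq_work:

struct dma_fence {
	spinlock_t *lock;
	const struct dma_fence_ops *ops;
	/*
	 * The same storage is reused over the fence's lifetime:
	 * the callback list while the fence is pending, then the
	 * signal timestamp, then the rcu head used to free it.
	 */
	union {
		struct list_head cb_list;
		/* @cb_list replaced by @timestamp on dma_fence_signal() */
		ktime_t timestamp;
		/* @timestamp replaced by @rcu on dma_fence_release() */
		struct rcu_head rcu;
	};
	u64 context;
	u64 seqno;
	unsigned long flags;
	struct kref refcount;
	int error;
};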
That won't work either. We free up the entity after the cleanup
function. That's the reason we use the callback on the job in the
first place.
Yep, missed it.
We could overload the cb structure in the job though.
I guess so, since no one else is using this member after the cb has executed.
Andrey
Attached a patch. Please give it a try; I tested it on my side and
tried to generate the right conditions to trigger this code path by
repeatedly submitting commands while issuing a GPU reset to stop the
scheduler and then killing the command submission process in the
middle. But for some reason it looks like the job_queue was already
empty by the time the entity was killed.
Andrey
Christian.
Andrey
Andrey
From 8ba5c089939b79a6567411c33d4db40e5846eef3 Mon Sep 17 00:00:00 2001
From: Andrey Grodzovsky <andrey.grodzovsky@xxxxxxx>
Date: Thu, 28 Oct 2021 12:24:03 -0400
Subject: [PATCH] drm/sched: Avoid lockdep splat on killing a process
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Problem:
Signaling one sched fence from within another sched fence's signal
callback generates a lockdep splat because both have the same lockdep
class for their fence->lock.

Fix:
Fix the stack below by deferring to irq_work the signaling and freeing
of the jobs that are left when an entity is killed.
[11176.741181] dump_stack+0x10/0x12
[11176.741186] __lock_acquire.cold+0x208/0x2df
[11176.741197] lock_acquire+0xc6/0x2d0
[11176.741204] ? dma_fence_signal+0x28/0x80
[11176.741212] _raw_spin_lock_irqsave+0x4d/0x70
[11176.741219] ? dma_fence_signal+0x28/0x80
[11176.741225] dma_fence_signal+0x28/0x80
[11176.741230] drm_sched_fence_finished+0x12/0x20 [gpu_sched]
[11176.741240] drm_sched_entity_kill_jobs_cb+0x1c/0x50 [gpu_sched]
[11176.741248] dma_fence_signal_timestamp_locked+0xac/0x1a0
[11176.741254] dma_fence_signal+0x3b/0x80
[11176.741260] drm_sched_fence_finished+0x12/0x20 [gpu_sched]
[11176.741268] drm_sched_job_done.isra.0+0x7f/0x1a0 [gpu_sched]
[11176.741277] drm_sched_job_done_cb+0x12/0x20 [gpu_sched]
[11176.741284] dma_fence_signal_timestamp_locked+0xac/0x1a0
[11176.741290] dma_fence_signal+0x3b/0x80
[11176.741296] amdgpu_fence_process+0xd1/0x140 [amdgpu]
[11176.741504] sdma_v4_0_process_trap_irq+0x8c/0xb0 [amdgpu]
[11176.741731] amdgpu_irq_dispatch+0xce/0x250 [amdgpu]
[11176.741954] amdgpu_ih_process+0x81/0x100 [amdgpu]
[11176.742174] amdgpu_irq_handler+0x26/0xa0 [amdgpu]
[11176.742393] __handle_irq_event_percpu+0x4f/0x2c0
[11176.742402] handle_irq_event_percpu+0x33/0x80
[11176.742408] handle_irq_event+0x39/0x60
[11176.742414] handle_edge_irq+0x93/0x1d0
[11176.742419] __common_interrupt+0x50/0xe0
[11176.742426] common_interrupt+0x80/0x90
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@xxxxxxx>
Suggested-by: Daniel Vetter <daniel.vetter@xxxxxxxx>
Suggested-by: Christian König <christian.koenig@xxxxxxx>
---
drivers/gpu/drm/scheduler/sched_entity.c | 15 ++++++++++++---
include/drm/gpu_scheduler.h | 12 +++++++++++-
2 files changed, 23 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index 27e1573af96e..191c56064f19 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -190,6 +190,16 @@ long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout)
}
EXPORT_SYMBOL(drm_sched_entity_flush);
+static void drm_sched_entity_kill_jobs_irq_work(struct irq_work *wrk)
+{
+ struct drm_sched_job *job = container_of(wrk, typeof(*job), work);
+
+ drm_sched_fence_finished(job->s_fence);
+ WARN_ON(job->s_fence->parent);
+ job->sched->ops->free_job(job);
+}
+
+
/* Signal the scheduler finished fence when the entity in question is killed. */
static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
struct dma_fence_cb *cb)
@@ -197,9 +207,8 @@ static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
struct drm_sched_job *job = container_of(cb, struct drm_sched_job,
finish_cb);
- drm_sched_fence_finished(job->s_fence);
- WARN_ON(job->s_fence->parent);
- job->sched->ops->free_job(job);
+ init_irq_work(&job->work, drm_sched_entity_kill_jobs_irq_work);
+ irq_work_queue(&job->work);
}
static struct dma_fence *
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index f011e4c407f2..bbc22fad8d80 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -28,6 +28,7 @@
#include <linux/dma-fence.h>
#include <linux/completion.h>
#include <linux/xarray.h>
+#include <linux/irq_work.h>
#define MAX_WAIT_SCHED_ENTITY_Q_EMPTY msecs_to_jiffies(1000)
@@ -286,7 +287,16 @@ struct drm_sched_job {
struct list_head list;
struct drm_gpu_scheduler *sched;
struct drm_sched_fence *s_fence;
- struct dma_fence_cb finish_cb;
+
+ /*
+ * work is used only after finish_cb has been used and will not be
+ * accessed anymore.
+ */
+ union {
+ struct dma_fence_cb finish_cb;
+ struct irq_work work;
+ };
+
uint64_t id;
atomic_t karma;
enum drm_sched_priority s_priority;
--
2.25.1