On 12/18/2024 3:20 AM, Dmitry Baryshkov wrote:
On Tue, Dec 17, 2024 at 04:27:57PM -0800, Jessica Zhang wrote:
From: Abhinav Kumar <quic_abhinavk@xxxxxxxxxxx>
There is no recovery mechanism in place yet to recover from mmu
faults for DPU. We can only prevent the faults by making sure there
is no misconfiguration.
Rate-limit the snapshot capture for mmu faults to once per
msm_atomic_commit_tail() as that should be sufficient to capture
the snapshot for debugging otherwise there will be a lot of DPU
snapshots getting captured for the same fault which is redundant
and also might affect capturing even one snapshot accurately.
Signed-off-by: Abhinav Kumar <quic_abhinavk@xxxxxxxxxxx>
Signed-off-by: Jessica Zhang <quic_jesszhan@xxxxxxxxxxx>
---
drivers/gpu/drm/msm/msm_atomic.c | 2 ++
drivers/gpu/drm/msm/msm_kms.c | 5 ++++-
drivers/gpu/drm/msm/msm_kms.h | 3 +++
3 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/msm/msm_atomic.c b/drivers/gpu/drm/msm/msm_atomic.c
index 9c45d641b5212c11078ab38c13a519663d85e10a..9ad7eeb14d4336abd9d8a8eb1382bdddce80508a 100644
--- a/drivers/gpu/drm/msm/msm_atomic.c
+++ b/drivers/gpu/drm/msm/msm_atomic.c
@@ -228,6 +228,8 @@ void msm_atomic_commit_tail(struct drm_atomic_state *state)
if (kms->funcs->prepare_commit)
kms->funcs->prepare_commit(kms, state);
+ kms->fault_snapshot_capture = 0;
+
- Please move it before the prepare_commit().
- You are accessing the same variable from different threads / cores.
There should be some kind of a sync barrier.
Hi Dmitry,
Ack, will add a lock for the snapshot capture counter.
Thanks,
Jessica Zhang
/*
* Push atomic updates down to hardware:
*/
diff --git a/drivers/gpu/drm/msm/msm_kms.c b/drivers/gpu/drm/msm/msm_kms.c
index 78830e446355f77154fa21a5d107635bc88ba3ed..3327caf396d4fc905dc127f09515559c12666dc8 100644
--- a/drivers/gpu/drm/msm/msm_kms.c
+++ b/drivers/gpu/drm/msm/msm_kms.c
@@ -168,7 +168,10 @@ static int msm_kms_fault_handler(void *arg, unsigned long iova, int flags, void
{
struct msm_kms *kms = arg;
- msm_disp_snapshot_state(kms->dev);
+ if (!kms->fault_snapshot_capture) {
+ msm_disp_snapshot_state(kms->dev);
+ kms->fault_snapshot_capture++;
+ }
return -ENOSYS;
}
diff --git a/drivers/gpu/drm/msm/msm_kms.h b/drivers/gpu/drm/msm/msm_kms.h
index e60162744c669773b6e5aef824a173647626ab4e..3ac089e26e14b824567f3cd2c62f82a1b9ea9878 100644
--- a/drivers/gpu/drm/msm/msm_kms.h
+++ b/drivers/gpu/drm/msm/msm_kms.h
@@ -128,6 +128,9 @@ struct msm_kms {
int irq;
bool irq_requested;
+ /* rate limit the snapshot capture to once per attach */
+ int fault_snapshot_capture;
+
/* mapper-id used to request GEM buffer mapped for scanout: */
struct msm_gem_address_space *aspace;
--
2.34.1
--
With best wishes
Dmitry