Re: [PATCH] drm/amdgpu: Enable SA software trap.

On 2022-09-22 at 13:57, Belanger, David wrote:
[AMD Official Use Only - General]



-----Original Message-----
From: Kuehling, Felix <Felix.Kuehling@xxxxxxx>
Sent: Thursday, September 22, 2022 1:14 PM
To: Belanger, David <David.Belanger@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Cornwall, Jay <Jay.Cornwall@xxxxxxx>
Subject: Re: [PATCH] drm/amdgpu: Enable SA software trap.

On 2022-09-22 at 12:17, David Belanger wrote:
Enables support for software trap for MES >= 4.
Adapted from implementation from Jay Cornwall.

v2: Add IP version check in conditions.

Signed-off-by: Jay Cornwall <Jay.Cornwall@xxxxxxx>
Signed-off-by: David Belanger <david.belanger@xxxxxxx>
Reviewed-by: Felix Kuehling <Felix.Kuehling@xxxxxxx>
---
   drivers/gpu/drm/amd/amdgpu/mes_v11_0.c        |   6 +-
   .../gpu/drm/amd/amdkfd/cwsr_trap_handler.h    | 771 +++++++++---------
   .../amd/amdkfd/cwsr_trap_handler_gfx10.asm    |  21 +
   .../gpu/drm/amd/amdkfd/kfd_int_process_v11.c  |  26 +-
   4 files changed, 437 insertions(+), 387 deletions(-)
[snip]
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c
index a6fcbeeb7428..4e03d19e9333 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c
@@ -358,13 +358,35 @@ static void event_interrupt_wq_v11(struct kfd_dev *dev,
   				break;
   			case SQ_INTERRUPT_WORD_ENCODING_ERROR:
   				print_sq_intr_info_error(context_id0, context_id1);
+				sq_int_priv = REG_GET_FIELD(context_id0,
+						SQ_INTERRUPT_WORD_WAVE_CTXID0, PRIV);
   				sq_int_errtype = REG_GET_FIELD(context_id0,
   						SQ_INTERRUPT_WORD_ERROR_CTXID0, TYPE);
-				if (sq_int_errtype != SQ_INTERRUPT_ERROR_TYPE_ILLEGAL_INST &&
-				    sq_int_errtype != SQ_INTERRUPT_ERROR_TYPE_MEMVIOL) {
+
+				switch (sq_int_errtype) {
+				case SQ_INTERRUPT_ERROR_TYPE_EDC_FUE:
+				case SQ_INTERRUPT_ERROR_TYPE_EDC_FED:
   					event_interrupt_poison_consumption_v11(
   							dev, pasid, source_id);
   					return;
+				case SQ_INTERRUPT_ERROR_TYPE_ILLEGAL_INST:
+					/*if (!(((adev->mes.sched_version & AMDGPU_MES_VERSION_MASK) >= 4) &&
+						  (adev->ip_versions[GC_HWIP][0] >= IP_VERSION(11, 0, 0)) &&
+						  (adev->ip_versions[GC_HWIP][0] <= IP_VERSION(11, 0, 3)))
+						&& sq_int_priv)
+						kfd_set_dbg_ev_from_interrupt(dev, pasid, -1,
+							KFD_EC_MASK(EC_QUEUE_WAVE_ILLEGAL_INSTRUCTION),
+							NULL, 0);*/
+					return;
+				case SQ_INTERRUPT_ERROR_TYPE_MEMVIOL:
+					/*if (!(((adev->mes.sched_version & AMDGPU_MES_VERSION_MASK) >= 4) &&
+						  (adev->ip_versions[GC_HWIP][0] >= IP_VERSION(11, 0, 0)) &&
+						  (adev->ip_versions[GC_HWIP][0] <= IP_VERSION(11, 0, 3)))
+						&& sq_int_priv)
+						kfd_set_dbg_ev_from_interrupt(dev, pasid, -1,
+							KFD_EC_MASK(EC_QUEUE_WAVE_MEMORY_VIOLATION),
+							NULL, 0);*/
Which branch is this for? kfd_set_dbg_ev_from_interrupt shouldn't exist on
the upstream branch yet. That code is still under review for upstream.

My understanding is that it is for the amd-staging-drm-next branch, on its way upstream.
The code that calls that function is commented out. The same file in amd-staging-drm-next already contains other commented-out calls to that function.
Please advise whether I should remove it from the patch for now or keep it commented out.

I'd prefer not to check in commented-out code to the upstream branch. Please work with Jon to make sure he includes this in his rocm-gdb patch series, where these changes belong. And you can submit them to the DKMS branch as a separate patch in the interim.

Thanks,
  Felix



Thanks,
David B.

Regards,
    Felix


+					return;
   				}
   				break;
   			default:
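
For readers following the review, the commented-out condition in the hunk above can be read as a standalone predicate. The following is only a sketch: the helper names `sw_trap_supported` and `should_raise_dbg_event` are hypothetical and do not appear in the patch, and the two macros are local stand-ins whose values are assumptions rather than copies of the kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the kernel macros; the values here are assumptions
 * made for illustration only. */
#define IP_VERSION(mj, mn, rv)  (((mj) << 16) | ((mn) << 8) | (rv))
#define AMDGPU_MES_VERSION_MASK 0x00000fffu

/* Hypothetical helper: the inner part of the quoted condition, i.e.
 * MES scheduler version >= 4 and GC IP version within 11.0.0..11.0.3. */
static bool sw_trap_supported(uint32_t mes_sched_version,
			      uint32_t gc_ip_version)
{
	return (mes_sched_version & AMDGPU_MES_VERSION_MASK) >= 4 &&
	       gc_ip_version >= IP_VERSION(11, 0, 0) &&
	       gc_ip_version <= IP_VERSION(11, 0, 3);
}

/* Hypothetical helper: the full quoted condition. The debug event call
 * would be reached only when the inner check fails and the PRIV field
 * read from the interrupt context is set. */
static bool should_raise_dbg_event(uint32_t mes_sched_version,
				   uint32_t gc_ip_version,
				   bool sq_int_priv)
{
	return !sw_trap_supported(mes_sched_version, gc_ip_version) &&
	       sq_int_priv;
}
```

Written this way, the negation applies to the whole version check, so the commented-out call would be skipped whenever both the MES version and the GC IP range conditions hold.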


