Re: [PATCH] drm/amdgpu: add option params to enforce process isolation between graphics and compute

On 01.06.23 at 13:14, Chong Li wrote:
enforce process isolation between graphics and compute via using the same reserved vmid.

Signed-off-by: Chong Li <chongli2@xxxxxxx>
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h     |  1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  9 +++++++++
  drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 13 ++++++++++++-
  3 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index ce196badf42d..48c5c547d85a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -215,6 +215,7 @@ extern int amdgpu_force_asic_type;
  extern int amdgpu_smartshift_bias;
  extern int amdgpu_use_xgmi_p2p;
  extern int amdgpu_mtype_local;
+extern int enforce_isolation;
  #ifdef CONFIG_HSA_AMD
  extern int sched_policy;
  extern bool debug_evictions;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 3d91e123f9bd..2e0ebd92b4cf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -973,6 +973,15 @@ MODULE_PARM_DESC(
  						4 = AMDGPU_CPX_PARTITION_MODE)");
  module_param_named(user_partt_mode, amdgpu_user_partt_mode, uint, 0444);
+
+/**
+ * DOC: enforce_isolation (int)
+ * enforce process isolation between graphics and compute via using the same reserved vmid.
+ */
+int enforce_isolation = 0;

Please move that to the other declarations above.

+module_param(enforce_isolation, int, 0444);

IIRC you can also use bool here.

+MODULE_PARM_DESC(enforce_isolation, "enforce process isolation between graphics and compute . 1 = On, 0 = Off");

This way you can drop the "1 = On, 0 = Off" part from the description, because "enforce_isolation=on" should then be accepted on the kernel command line as well.
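To illustrate the point about bool parameters: switching to `module_param(enforce_isolation, bool, 0444)` with a `bool` variable means the value is parsed as a boolean string rather than an integer. The following is only a simplified userspace model of that kind of parsing, written for illustration; `parse_bool_param` is a hypothetical helper, not the kernel's own parser, and the exact set of spellings the kernel accepts may differ.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Simplified model of boolean parameter parsing (hypothetical helper).
 * Accepts y/Y/1 and "on" as true, n/N/0 and "off" as false.
 * Returns 0 and stores the result in *res on success, -1 on bad input.
 */
static int parse_bool_param(const char *s, bool *res)
{
	if (!s || !res)
		return -1;

	switch (s[0]) {
	case 'y': case 'Y': case '1':
		*res = true;
		return 0;
	case 'n': case 'N': case '0':
		*res = false;
		return 0;
	case 'o': case 'O':
		/* "on" and "off" are told apart by the second character */
		switch (s[1]) {
		case 'n': case 'N':
			*res = true;
			return 0;
		case 'f': case 'F':
			*res = false;
			return 0;
		default:
			break;
		}
		break;
	default:
		break;
	}
	return -1;
}
```

With parsing along these lines, both `enforce_isolation=1` and `enforce_isolation=on` select the same setting, which is why the "1 = On, 0 = Off" text in the description becomes redundant.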

+
  /* These devices are not supported by amdgpu.
   * They are supported by the mach64, r128, radeon drivers
   */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
index c991ca0b7a1c..33efa17d08ff 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
@@ -409,7 +409,7 @@ int amdgpu_vmid_grab(struct amdgpu_vm *vm, struct amdgpu_ring *ring,
  	if (r || !idle)
  		goto error;
-	if (vm->reserved_vmid[vmhub]) {
+	if (vm->reserved_vmid[vmhub] || (enforce_isolation && (vmhub == AMDGPU_GFXHUB(0)))) {
  		r = amdgpu_vmid_grab_reserved(vm, ring, job, &id, fence);
  		if (r || !id)
  			goto error;
@@ -578,6 +578,17 @@ void amdgpu_vmid_mgr_init(struct amdgpu_device *adev)
  			list_add_tail(&id_mgr->ids[j].list, &id_mgr->ids_lru);
  		}
  	}
+
+	if (enforce_isolation) {
+		struct amdgpu_vmid_mgr *id_mgr = &adev->vm_manager.id_mgr[AMDGPU_GFXHUB(0)];
+		struct amdgpu_vmid *id = NULL;

Empty line between declaration and code please.

+		++id_mgr->reserved_use_count;
+		id = list_first_entry(&id_mgr->ids_lru, struct amdgpu_vmid,
+					list);
+		/* Remove from normal round robin handling */
+		list_del_init(&id->list);
+		id_mgr->reserved = id;

It would be good if we don't duplicate this hunk here and in amdgpu_vmid_alloc_reserved().

We should probably clean up amdgpu_vmid_alloc_reserved() a bit and move the check for vm->reserved_vmid into amdgpu_vm_ioctl().

This way we could call amdgpu_vmid_alloc_reserved() here as well.
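The restructuring described above can be sketched with a small userspace model. Everything here is a hypothetical stand-in (`struct vmid_mgr`, `struct vm`, `vmid_alloc_reserved`, `vm_ioctl_reserve`, `vmid_mgr_setup`), not the driver's actual code, and the model assumes the ID is reserved when the first reference is taken. The idea it shows is that once the per-VM check lives in the ioctl path, the shared helper no longer needs a VM and can also be called from manager init.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_IDS 4

/* Hypothetical, simplified stand-ins for the driver's structures. */
struct vmid_mgr {
	bool in_lru[NUM_IDS];            /* true while the ID is in round robin */
	int reserved;                    /* index of the reserved ID, -1 if none */
	unsigned int reserved_use_count; /* number of users of the reserved ID */
};

struct vm {
	bool reserved_vmid;              /* this VM already holds a reference */
};

/* Shared helper: take a reference, reserving an ID for the first user. */
static int vmid_alloc_reserved(struct vmid_mgr *mgr)
{
	if (mgr->reserved_use_count++ == 0) {
		for (int i = 0; i < NUM_IDS; i++) {
			if (mgr->in_lru[i]) {
				/* Remove from normal round robin handling */
				mgr->in_lru[i] = false;
				mgr->reserved = i;
				return 0;
			}
		}
		mgr->reserved_use_count--;   /* nothing left to reserve */
		return -1;
	}
	return 0;
}

/* Caller 1: the ioctl path owns the per-VM "already reserved" check. */
static int vm_ioctl_reserve(struct vmid_mgr *mgr, struct vm *vm)
{
	int r;

	if (vm->reserved_vmid)
		return 0;

	r = vmid_alloc_reserved(mgr);
	if (!r)
		vm->reserved_vmid = true;
	return r;
}

/* Caller 2: manager init reuses the same helper when isolation is on. */
static int vmid_mgr_setup(struct vmid_mgr *mgr, bool enforce_isolation)
{
	for (size_t i = 0; i < NUM_IDS; i++)
		mgr->in_lru[i] = true;
	mgr->reserved = -1;
	mgr->reserved_use_count = 0;

	if (enforce_isolation)
		return vmid_alloc_reserved(mgr);
	return 0;
}
```

In this model the list manipulation exists in exactly one place, so the init path and the ioctl path cannot drift apart.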

Apart from that, this looks good from the technical side.

Regards,
Christian.

+	}
  }
/**



