RE: [PATCH v2] drm/amdgpu: Surface svm_attr_gobm, a RW module parameter

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



[AMD Official Use Only - AMD Internal Distribution Only]


Responses inline. Will post new patch after testing

 

Regards,

Ramesh

 

From: Kasiviswanathan, Harish <Harish.Kasiviswanathan@xxxxxxx>
Sent: Wednesday, August 28, 2024 1:37 PM
To: Yang, Philip <Philip.Yang@xxxxxxx>; Errabolu, Ramesh <Ramesh.Errabolu@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Subject: RE: [PATCH v2] drm/amdgpu: Surface svm_attr_gobm, a RW module parameter

 

[AMD Official Use Only - AMD Internal Distribution Only]

 

Some comments inline.

 

From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of Philip Yang
Sent: Wednesday, August 28, 2024 11:49 AM
To: Errabolu, Ramesh <Ramesh.Errabolu@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Subject: Re: [PATCH v2] drm/amdgpu: Surface svm_attr_gobm, a RW module parameter

 

 

On 2024-08-26 15:34, Ramesh Errabolu wrote:

Enables users to update the default size of buffer used
in migration either from Sysmem to VRAM or vice versa.
The param GOBM refers to granularity of buffer migration,
 
[HK]: Include the module param name here. 
[Ramesh] Changed param name to svm_default_granularity. It is included as part of the header.
 
and is specified in terms of log(numPages(buffer)). It
facilitates users of unregistered memory to control GOBM,

it is used for both registered and unregistered range cases.

Ramesh: Removed reference to unregistered memory as it is technically possible to update granularity by calling set_svm_attr() ioctl

 
albeit at a coarse level
 
Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@xxxxxxx>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h     |  4 ++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 18 +++++++++++++++++
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h   | 12 ++++++++++++
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c    | 26 ++++++++++++++++---------
 4 files changed, 51 insertions(+), 9 deletions(-)
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index e8c284aea1f2..73dd816b01f2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -237,6 +237,7 @@ extern int sched_policy;
 extern bool debug_evictions;
 extern bool no_system_mem_limit;
 extern int halt_if_hws_hang;
+extern uint amdgpu_svm_attr_gobm;
 #else
 static const int __maybe_unused sched_policy = KFD_SCHED_POLICY_HWS;
 static const bool __maybe_unused debug_evictions; /* = false */
@@ -313,6 +314,9 @@ extern int amdgpu_wbrf;
 /* Extra time delay(in ms) to eliminate the influence of temperature momentary fluctuation */
 #define AMDGPU_SWCTF_EXTRA_DELAY              50
 
+/* Default size of buffer to use in migrating buffer */
+#define AMDGPU_SVM_ATTR_GOBM       9
+
 
[HK]: Don’t think this #define is need. You can just use 9 directly as other module parameters for default value.
[Ramesh] Removed it
 
 struct amdgpu_xcp_mgr;
 struct amdgpu_device;
 struct amdgpu_irq_src;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index b9529948f2b2..09c501753a3b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -169,6 +169,17 @@ uint amdgpu_sdma_phase_quantum = 32;
 char *amdgpu_disable_cu;
 char *amdgpu_virtual_display;
 bool enforce_isolation;
+
+/* Specifies the default size of buffer to use in
+ * migrating buffer from Sysmem to VRAM and vice
+ * versa
+ *
+ * GOBM - Granularity of Buffer Migration
+ *
+ * Defined as log2(sizeof(buffer)/PAGE_SIZE)
 
[HK]: Use of log is unnecessary here and other places. You can just state is size is exponent of 2. 
[Ramesh] Updated comment to state that this symbol server two purposes – buffer migration and page fault handling. Value of granularity is defined as log(2^Pages) so that aspect has to be highlighted. How this is interpreted, including any scaling is opaque
 
+ */
+uint amdgpu_svm_attr_gobm = AMDGPU_SVM_ATTR_GOBM;
+
 /*
  * OverDrive(bit 14) disabled by default
  * GFX DCS(bit 19) disabled by default
@@ -320,6 +331,13 @@ module_param_named(pcie_gen2, amdgpu_pcie_gen2, int, 0444);
 MODULE_PARM_DESC(msi, "MSI support (1 = enable, 0 = disable, -1 = auto)");
 module_param_named(msi, amdgpu_msi, int, 0444);
 
+/**
+ * DOC: svm_attr_gobm (uint)
+ * Size of buffer to use in migrating buffer from Sysmem to VRAM and vice versa
+ */
+MODULE_PARM_DESC(svm_attr_gobm, "Defined as log2(sizeof(buffer)/PAGE_SIZE), e.g. 9 for 2 MiB");
+module_param_named(svm_attr_gobm, amdgpu_svm_attr_gobm, uint, 0644);
+
 /**
  * DOC: lockup_timeout (string)
  * Set GPU scheduler timeout value in ms.
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 9ae9abc6eb43..c2e54b18c167 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -868,6 +868,18 @@ struct svm_range_list {
  struct task_struct       *faulting_task;
  /* check point ts decides if page fault recovery need be dropped */
  uint64_t                 checkpoint_ts[MAX_GPU_INSTANCE];
+
+ /* Indicates the default size to use in migrating
+  * buffers of a process from Sysmem to VRAM and vice
+  * versa. The max legal value cannot be greater than
+  * 0x3F
+  *
+  * @note: A side effect of this symbol being part of
+  * struct svm_range_list is that it forces all buffers
+  * of the process of unregistered kind to use the same
+  * size in buffer migration
+  */

The comment is good enough, note maybe misleading, not needed.

[Ramesh]: Removed references to unregistered memory users in the NOTE section.

 

 
+ uint8_t attr_gobm;
 };
 
 /* Process data */
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index b44dec90969f..78c78baddb1f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -309,12 +309,11 @@ static void svm_range_free(struct svm_range *prange, bool do_unmap)
 }
 
 static void
-svm_range_set_default_attributes(int32_t *location, int32_t *prefetch_loc,
-                         uint8_t *granularity, uint32_t *flags)
+svm_range_set_default_attributes(int32_t *location,
+                int32_t *prefetch_loc, uint32_t *flags)
 {
  *location = KFD_IOCTL_SVM_LOCATION_UNDEFINED;
  *prefetch_loc = KFD_IOCTL_SVM_LOCATION_UNDEFINED;
- *granularity = 9;

as svm_range_set_default_attributes is called in multiple places, add new parameter struct svm_range_list *svms, to remove the duplicate code.

Ramesh: Restored the original function signature i.e. set granularity inside the set_default_attributes() function

*granularity = svms->attr_gobm;
 
  *flags =
         KFD_IOCTL_SVM_FLAG_HOST_ACCESS | KFD_IOCTL_SVM_FLAG_COHERENT;
 }
@@ -358,9 +357,9 @@ svm_range *svm_range_new(struct svm_range_list *svms, uint64_t start,
         bitmap_copy(prange->bitmap_access, svms->bitmap_supported,
                     MAX_GPU_INSTANCE);
 
+ prange->granularity = svms->attr_gobm;
  svm_range_set_default_attributes(&prange->preferred_loc,
-                                &prange->prefetch_loc,
-                                &prange->granularity, &prange->flags);
+                        &prange->prefetch_loc, &prange->flags);
 
  pr_debug("svms 0x%p [0x%llx 0x%llx]\n", svms, start, last);
 
@@ -2693,10 +2692,12 @@ svm_range_get_range_boundaries(struct kfd_process *p, int64_t addr,
 
  *is_heap_stack = vma_is_initial_heap(vma) || vma_is_initial_stack(vma);
 
+ /* Determine the starting and ending page of prange */

this comment is not helpful.

Ramesh: removed it

 
  start_limit = max(vma->vm_start >> PAGE_SHIFT,
-              (unsigned long)ALIGN_DOWN(addr, 2UL << 8));
+              (unsigned long)ALIGN_DOWN(addr, 1 << p->svms.attr_gobm));

1UL << p->svms.attr_gobm

Ramesh: done

  end_limit = min(vma->vm_end >> PAGE_SHIFT,
-            (unsigned long)ALIGN(addr + 1, 2UL << 8));
+            (unsigned long)ALIGN(addr + 1, 1 << p->svms.attr_gobm));

1UL << p->svms.attr_gobm

Ramesh: done

Regards,

Philip

 
+
  /* First range that starts after the fault address */
  node = interval_tree_iter_first(&p->svms.objects, addr + 1, ULONG_MAX);
  if (node) {
@@ -3240,6 +3241,12 @@ int svm_range_list_init(struct kfd_process *p)
         if (KFD_IS_SVM_API_SUPPORTED(p->pdds[i]->dev->adev))
                 bitmap_set(svms->bitmap_supported, i, 1);
 
+ /* Bind granularity of buffer migration, either
+  * the default size or one specified by the user
+  */
+ svms->attr_gobm = min_t(u8, amdgpu_svm_attr_gobm, 0x3F);
+ pr_debug("Granularity Of Buffer Migration: %d\n", svms->attr_gobm);
+
  return 0;
 }
 
@@ -3767,8 +3774,9 @@ svm_range_get_attr(struct kfd_process *p, struct mm_struct *mm,
  node = interval_tree_iter_first(&svms->objects, start, last);
  if (!node) {
         pr_debug("range attrs not found return default values\n");
-   svm_range_set_default_attributes(&location, &prefetch_loc,
-                                        &granularity, &flags_and);
+        granularity = svms->attr_gobm;
+   svm_range_set_default_attributes(&location,
+                          &prefetch_loc, &flags_and);
         flags_or = flags_and;
         if (p->xnack_enabled)
                 bitmap_copy(bitmap_access, svms->bitmap_supported,

[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux