On 2022-06-29 18:20, Felix Kuehling wrote:
On 2022-06-28 17:43, Eric Huang wrote:
Two changes:
1. Reduce unnecessary evict/unmap when the range is not mapped to a GPU.
2. Always evict when the KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED flag is set.
Signed-off-by: Eric Huang <jinhuieric.huang@xxxxxxx>
---
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 4bf2f75f853b..76e817687ef9 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -1767,12 +1767,16 @@ svm_range_evict(struct svm_range *prange, struct mm_struct *mm,
struct kfd_process *p;
int r = 0;
+ if (!prange->mapped_to_gpu)
+ return 0;
This feels like an unrelated optimization that should be in a separate
patch.
But I'm not sure this is correct, because it doesn't consider child
ranges. svm_range_unmap_from_gpus already contains this check, so
ranges should not be unmapped unnecessarily either way. Is there any
other benefit to this change that I'm missing?
I will send another patch separately that considers child ranges. The
benefit is that it reduces unnecessary queue evictions when allocating
context save memory, which is not mapped to the GPU, so it is an
efficiency improvement. On the other hand, without this optimization
KFDCWSRTest.InterruptRestore fails with a queue preemption error. I think
the reason is that the extra queue evictions make the HWS too busy to
preempt existing queues. There is one unmap_queue packet sent to the HWS
in the current code, but there would be three unmap_queue packets with
unified memory allocation. So this optimization keeps only one
unmap_queue packet, as before.
Regards,
Eric
Regards,
Felix
+
p = container_of(svms, struct kfd_process, svms);
pr_debug("invalidate svms 0x%p prange [0x%lx 0x%lx] [0x%lx 0x%lx]\n",
	 svms, prange->start, prange->last, start, last);
- if (!p->xnack_enabled) {
+ if (!p->xnack_enabled ||
+ (prange->flags & KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED)) {
int evicted_ranges;
list_for_each_entry(pchild, &prange->child_list,
child_list) {
@@ -3321,7 +3325,9 @@ svm_range_set_attr(struct kfd_process *p, struct mm_struct *mm,
if (r)
goto out_unlock_range;
- if (migrated && !p->xnack_enabled) {
+ if (migrated && (!p->xnack_enabled ||
+ (prange->flags & KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED)) &&
+ prange->mapped_to_gpu) {
pr_debug("restore_work will update mappings of GPUs\n");
mutex_unlock(&prange->migrate_mutex);
continue;