On 2023-09-13 12:14, Felix Kuehling
wrote:
On 2023-09-13 11:16, Philip Yang wrote:
If new range is added to update list, splited to multiple pranges with
max_svm_range_pages alignment, and svm validate and map returns error
for the first prange, then the caller retry should add pranges with
prange->is_error_flag or prange without prange->mapped_to_gpu to the
update list, to update GPU mapping for the entire range.
It looks like the only new thing here is to remove the "same attribute" optimization for ranges that are not mapped on the GPU. I don't fully understand the scenario you're describing here, but it feels like this change has a bigger impact than it needs to have. Your description specifically talks about ranges split at max_svm_range_pages boundaries. But your patch affects all ranges not mapped on the GPU, even it prange->is_error_flag is not set.
Maybe that's OK, because the expensive thing is updating existing mappings unnecessarily. If there is no existing mapping yet, it's probably not a big deal. I just don't understand the scenario that requires a retry without the prange->is_error_flag being set. Maybe a better fix would be to ensure that prange->is_error_flag gets set in your scenario.
Another way to fix the issue, set_attr continue to call svm_range_validate_and_map for all pranges of update_list after svm_range_validate_and_map return error, this change will have less side-effect.
Regards,
Philip
Regards,
Felix
Fixes: c22b04407097 ("drm/amdkfd: flag added to handle errors from svm validate and map")
Signed-off-by: Philip Yang <Philip.Yang@xxxxxxx>
Tested-by: James Zhu <james.zhu@xxxxxxx>
---
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 61dd66bddc3c..8871329e9cbd 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -825,7 +825,7 @@ svm_range_is_same_attrs(struct kfd_process *p, struct svm_range *prange,
}
}
- return !prange->is_error_flag;
+ return true;
}
/**
@@ -2228,7 +2228,8 @@ svm_range_add(struct kfd_process *p, uint64_t start, uint64_t size,
next = interval_tree_iter_next(node, start, last);
next_start = min(node->last, last) + 1;
- if (svm_range_is_same_attrs(p, prange, nattr, attrs)) {
+ if (!prange->is_error_flag && prange->mapped_to_gpu &&
+ svm_range_is_same_attrs(p, prange, nattr, attrs)) {
/* nothing to do */
} else if (node->start < start || node->last > last) {
/* node intersects the update range and its attributes