Re: [PATCH] drm/amdkfd: Fix an issue at userptr buffer validation process.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2023-04-20 18:25, Xiaogang.Chen wrote:
From: Xiaogang Chen <xiaogang.chen@xxxxxxx>

amdgpu_ttm_tt_get_user_pages can fail(-EFAULT). If it failed mem has no associated
hmm range or user_pages associated. Keep it at process_info->userptr_inval_list and
mark mem->invalid until following scheduled attempts can valid it.

Thank you! This patch looks good to me. One more nit-pick inline.



Signed-off-by: Xiaogang Chen <Xiaogang.Chen@xxxxxxx>
---
  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 28 ++++++++++++++-----
  1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 7b1f5933ebaa..fad5183baf80 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -2444,7 +2444,9 @@ static int update_invalid_user_pages(struct amdkfd_process_info *process_info,
  			ret = -EAGAIN;
  			goto unlock_out;
  		}
-		mem->invalid = 0;
+		 /* set mem valid if mem has hmm range associated */
+		if (mem->range)
+			mem->invalid = 0;
  	}
unlock_out:
@@ -2576,16 +2578,28 @@ static int confirm_valid_user_pages_locked(struct amdkfd_process_info *process_i
  	list_for_each_entry_safe(mem, tmp_mem,
  				 &process_info->userptr_inval_list,
  				 validate_list.head) {
-		bool valid = amdgpu_ttm_tt_get_user_pages_done(
-				mem->bo->tbo.ttm, mem->range);
+		/* Only check mem with hmm range associated */
+		bool valid;
- mem->range = NULL;
-		if (!valid) {
-			WARN(!mem->invalid, "Invalid BO not marked invalid");
+		if (mem->range) {

You can avoid unnecessary nesting and make this more readable by inverting the condition:

		if (!mem->range)
			continue;

		valid = amdgpu_ttm_tt_get_user_pages_done(
				mem->bo->tbo.ttm, mem->range);
		...

With that fixed, the patch is

Reviewed-by: Felix Kuehling <Felix.Kuehling@xxxxxxx>


+			valid = amdgpu_ttm_tt_get_user_pages_done(
+					mem->bo->tbo.ttm, mem->range);
+
+			mem->range = NULL;
+			if (!valid) {
+				WARN(!mem->invalid, "Invalid BO not marked invalid");
+				ret = -EAGAIN;
+				continue;
+			}
+		} else
+			/* keep mem without hmm range at userptr_inval_list */
+			continue;
+
+		if (mem->invalid) {
+			WARN(1, "Valid BO is marked invalid");
  			ret = -EAGAIN;
  			continue;
  		}
-		WARN(mem->invalid, "Valid BO is marked invalid");
list_move_tail(&mem->validate_list.head,
  			       &process_info->userptr_valid_list);



[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux