Re: [PATCH] drm/i915/ttm: Fix access_memory null pointer exception

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 14/10/2022 11:38, Das, Nirmoy wrote:
Hi Matt,

On 10/14/2022 12:13 PM, Matthew Auld wrote:
On 14/10/2022 10:27, Das, Nirmoy wrote:
Hi Matt

On 10/14/2022 10:39 AM, Matthew Auld wrote:
On 13/10/2022 18:56, Jonathan Cavitt wrote:
i915_ttm_to_gem can return a NULL pointer, which is
dereferenced in i915_ttm_access_memory without first
checking if it is NULL.  Inspecting
i915_ttm_io_mem_reserve, it appears the correct
behavior in this case is to return -EINVAL.

The GEM object has already been dereferenced before this point, if you look at the caller (vm_access_ttm). The NULL obj thing is to identify "ttm ghost objects", and I don't think a normal userpace object can suddenly become one (access_memory comes from ptrace). AFAIK ghost objects are just for temporarily hanging on to some memory/state, while the dma-resv is busy. In the places where ttm is the one giving us the object, then it might be possible to see these types of objects, since ttm could in theory pass one in (like during eviction).


Yes, we should not hit this.  Thanks for the nice "ttm ghost objects" reminder :)


I think we can still have this check to avoid code analysis tool warnings, what do you think ?

IMHO I think it just makes it harder to understand the code, since conceptually it should be impossible, given how "ghost objects" actually work. Adding such a check gives the impression that it is somehow now possible to be given one here (like with eviction etc). AFAIK just letting it crash is fine, instead of littering the code with NULL checks for stuff that is never meant to be NULL and would be a driver bug. Also there are a bunch of other places not checking that i915_ttm_to_gem() returns NULL, so why just here?

This is tricky because some place we might receive NULL and some other places we might not(from i915_ttm_to_gem). I also don't like the idea of sprinkling NULL check everywhere.

I think the issue is i915_ttm_to_gem  returns NULL for non-i915 BO. We should move "if (bo->destroy != i915_ttm_bo_destroy)" check to the respective function where we

expect ghost object. That should make the static code analyzer happy and also makes it very clear which function expect ghost objects.

Yeah, that sounds like a really nice idea to me. amdgpu looks to have something like amdgpu_bo_is_amdgpu_bo() for the spots that might be "ghost objects". Maybe we can add something like i915_ttm_is_ghost_bo() or similar for our needs.



Did the code analysis tool find something? Also why doesn't it complain about vm_access_ttm(), which is the one actually calling access_memory() and is itself also doing i915_ttm_to_gem() and also not checking for NULL?


Yes, I think the patch idea came from our static code analyzer warning but I can't seem to open the URL. I am also not sure why it doesn't complain for other cases.


Thanks,

Nirmoy




Thanks,

Nirmoy



Fixes: 26b15eb0 ("drm/i915/ttm: implement access_memory")
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@xxxxxxxxx>
Suggested-by: John C Harrison <John.C.Harrison@xxxxxxxxx>
CC: Matthew Auld <matthew.auld@xxxxxxxxx>
CC: Andrzej Hajda <andrzej.hajda@xxxxxxxxx>
CC: Nirmoy Das <nirmoy.das@xxxxxxxxx>
CC: Andi Shyti <andi.shyti@xxxxxxxxxxxxxxx>
---
  drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 9 +++++++--
  1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
index d63f30efd631..b569624f2ed9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
@@ -704,11 +704,16 @@ static int i915_ttm_access_memory(struct ttm_buffer_object *bo,
                    int len, int write)
  {
      struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
-    resource_size_t iomap = obj->mm.region->iomap.base -
-        obj->mm.region->region.start;
+    resource_size_t iomap;
      unsigned long page = offset >> PAGE_SHIFT;
      unsigned long bytes_left = len;
  +    if (!obj)
+        return -EINVAL;
+
+    iomap = obj->mm.region->iomap.base -
+        obj->mm.region->region.start;
+
      /*
       * TODO: For now just let it fail if the resource is non-mappable,        * otherwise we need to perform the memcpy from the gpu here, without



[Index of Archives]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux