On 30/07/2019 21:03, Rob Herring wrote:
> On Thu, Jul 25, 2019 at 9:35 AM Steven Price <steven.price@xxxxxxx> wrote:
>>
>> On 25/07/2019 15:59, Steven Price wrote:
>> [...]
>>> It would appear that in the following call sgt==NULL:
>>>> ret = sg_alloc_table_from_pages(sgt, pages + page_offset,
>>>>                                 NUM_FAULT_PAGES, 0, SZ_2M, GFP_KERNEL);
>>>
>>> Which means we've ended up with a BO with bo->sgt==NULL, bo->pages set
>>> and bo->is_heap=true. My understanding is this should be impossible.
>>>
>>> I haven't yet figured out how this happens - it seems to be just before
>>> termination, so it might be a race with cleanup?
>>
>> That was a red herring - it's partly my test case doing something a bit
>> weird. This crash is caused by doing an mmap of a HEAP object before any
>> fault has occurred.
>>
>> drm_gem_shmem_mmap() calls drm_gem_shmem_get_pages() which will populate
>> bo->base.pages (even if bo->is_heap).
>>
>> Either we should prevent mapping of HEAP objects, or alternatively
>> bo->base.pages could be allocated upfront instead of during the first
>> fault. My preference would be allocating it upfront, because optimising
>> for the case of a HEAP BO which is never used seems a bit weird. Although
>> there's still the question of exactly what the behaviour should be when
>> accessing, through the CPU, pages which haven't been allocated yet.
>
> As preventing getting the mmap fake offset should be sufficient, I'm
> planning on doing this change:
>
> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c
> b/drivers/gpu/drm/panfrost/panfrost_drv.c
> index 746eb4603bc2..186d5db892a9 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> @@ -270,6 +270,10 @@ static int panfrost_ioctl_mmap_bo(struct drm_device *dev, void *data,
>                 return -ENOENT;
>         }
>
> +       /* Don't allow mmapping of heap objects as pages are not pinned. */
> +       if (to_panfrost_bo(gem_obj)->is_heap)
> +               return -EINVAL;
> +
>         ret = drm_gem_create_mmap_offset(gem_obj);
>         if (ret == 0)
>                 args->offset = drm_vma_node_offset_addr(&gem_obj->vma_node);

Seems reasonable to me - we can always add support for mmap()ing at a
later date if it becomes useful.

>> Also shmem->pages_use_count needs incrementing to stop
>> drm_gem_shmem_get_pages() replacing bo->base.pages. I haven't tested
>> what happens if you mmap *after* the first fault.
>
> I set pages_use_count to 1 when we allocate pages on the first fault.
> Or do you mean we need to set it on creation in case
> drm_gem_shmem_get_pages() is called before any GPU faults?
>
> Either way, that just shifts how/where we crash, I think. We need to
> prevent drm_gem_shmem_get_pages() from being called. Besides mmap, the
> other cases are vmap and exporting. I don't think we have any paths
> that will cause vmap to get called in our case. For exporting, perhaps
> we need a wrapper around drm_gem_shmem_pin() to prevent it.

Yes, either every path to drm_gem_shmem_get_pages() needs to be blocked
for HEAP objects, or you need to set pages_use_count to 1 when you
allocate (which then means drm_gem_shmem_get_pages() will simply
increment the ref-count). Of course, if you are leaving any calls to
drm_gem_shmem_get_pages() reachable, then you also need to ensure that
the code that follows understands how to deal with a sparse bo->pages
array. Exporting would be a good example - and again I suspect just
preventing it is fine for now.
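For the export case, I'd expect a thin wrapper to do the job - something
along these lines (just a sketch, assuming per-object
drm_gem_object_funcs are wired up; everything here other than
drm_gem_shmem_pin() and is_heap is illustrative):

#include <drm/drm_gem_shmem_helper.h>

#include "panfrost_gem.h"

static int panfrost_gem_pin(struct drm_gem_object *obj)
{
	/* Heap objects grow on GPU fault, so their pages are never all
	 * resident; refuse to pin (and therefore to export) them. */
	if (to_panfrost_bo(obj)->is_heap)
		return -EINVAL;

	return drm_gem_shmem_pin(obj);
}

static const struct drm_gem_object_funcs panfrost_gem_funcs = {
	/* ... existing callbacks ... */
	.pin = panfrost_gem_pin,
};

Refusing the pin up front keeps the sparse bo->pages handling out of the
export path entirely, which matches the "just prevent it" approach above.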
Steve