On 30/07/2019 21:03, Rob Herring wrote:
> On Thu, Jul 25, 2019 at 9:35 AM Steven Price <steven.price@xxxxxxx> wrote:
>>
>> On 25/07/2019 15:59, Steven Price wrote:
>> [...]
>>> It would appear that in the following call sgt==NULL:
>>>> ret = sg_alloc_table_from_pages(sgt, pages + page_offset,
>>>>                                 NUM_FAULT_PAGES, 0, SZ_2M, GFP_KERNEL);
>>>
>>> Which means we've ended up with a BO with bo->sgt==NULL, bo->pages set
>>> and bo->is_heap=true. My understanding is this should be impossible.
>>>
>>> I haven't yet figured out how this happens - it seems to be just before
>>> termination, so it might be a race with cleanup?
>>
>> That was a red herring - it's partly my test case doing something a bit
>> weird. This crash is caused by doing an mmap of a HEAP object before any
>> fault has occurred.
>>
>> drm_gem_shmem_mmap() calls drm_gem_shmem_get_pages() which will populate
>> bo->base.pages (even if bo->is_heap).
>>
>> Either we should prevent mapping of HEAP objects, or alternatively
>> bo->base.pages could be allocated upfront instead of during the first
>> fault. My preference would be allocating it upfront, because optimising
>> for the case of a HEAP BO which is never used seems a bit weird. Although
>> there's still the question of exactly what the behaviour should be when
>> accessing, through the CPU, pages which haven't been allocated yet.
>
> As preventing getting the mmap fake offset should be sufficient, I'm
> planning on doing this change:
>
> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c
> b/drivers/gpu/drm/panfrost/panfrost_drv.c
> index 746eb4603bc2..186d5db892a9 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> @@ -270,6 +270,10 @@ static int panfrost_ioctl_mmap_bo(struct drm_device *dev, void *data,
>                 return -ENOENT;
>         }
>
> +       /* Don't allow mmapping of heap objects as pages are not pinned. */
> +       if (to_panfrost_bo(gem_obj)->is_heap)
> +               return -EINVAL;
> +
>         ret = drm_gem_create_mmap_offset(gem_obj);
>         if (ret == 0)
>                 args->offset = drm_vma_node_offset_addr(&gem_obj->vma_node);

Seems reasonable to me - we can always add support for mmap()ing at a
later date if it becomes useful.

>> Also shmem->pages_use_count needs incrementing to stop
>> drm_gem_shmem_get_pages() replacing bo->base.pages. I haven't tested
>> what happens if you mmap *after* the first fault.
>
> I set pages_use_count to 1 when we allocate pages on the first fault.
> Or do you mean we need to set it on creation in case
> drm_gem_shmem_get_pages() is called before any GPU faults?
>
> Either way, that just shifts how/where we crash, I think. We need to
> prevent drm_gem_shmem_get_pages() from being called. Besides mmap, the
> other cases are vmap and exporting. I don't think we have any paths
> that will cause vmap to get called in our case. For exporting, perhaps
> we need a wrapper around drm_gem_shmem_pin() to prevent it.

Yes, either every path to drm_gem_shmem_get_pages() needs to be blocked
for HEAP objects, or you need to set pages_use_count to 1 when you
allocate (which then means drm_gem_shmem_get_pages() will simply
increment the ref-count). Of course, if you are leaving any calls to
drm_gem_shmem_get_pages() reachable, then you also need to ensure that
the code that follows understands how to deal with a sparse bo->pages
array. Exporting would be a good example - and again I suspect just
preventing it is fine for now.
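For the export case, I'd expect a thin wrapper to do the job - something
along these lines (just a sketch, assuming per-object
drm_gem_object_funcs are wired up; everything here other than
drm_gem_shmem_pin() and is_heap is illustrative):

#include <drm/drm_gem_shmem_helper.h>

#include "panfrost_gem.h"

static int panfrost_gem_pin(struct drm_gem_object *obj)
{
	/* Heap objects grow on GPU fault, so their pages are never all
	 * resident; refuse to pin (and therefore to export) them. */
	if (to_panfrost_bo(obj)->is_heap)
		return -EINVAL;

	return drm_gem_shmem_pin(obj);
}

static const struct drm_gem_object_funcs panfrost_gem_funcs = {
	/* ... existing callbacks ... */
	.pin = panfrost_gem_pin,
};

Refusing the pin up front keeps the sparse bo->pages handling out of the
export path entirely, which matches the "just prevent it" approach above.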
Steve