On 30/04/2024 12:28, Boris Brezillon wrote:
> From: Antonino Maniscalco <antonino.maniscalco@xxxxxxxxxxxxx>
>
> If the kernel couldn't allocate memory because we reached the maximum
> number of chunks but no render passes are in flight
> (panthor_heap_grow() returning -ENOMEM), we should defer the OOM
> handling to the FW by returning a NULL chunk. The FW will then call
> the tiler OOM exception handler, which is supposed to implement
> incremental rendering (execute an intermediate fragment job to flush
> the pending primitives, release the tiler memory that was used to
> store those primitives, and start over from where it stopped).
>
> Instead of checking for both ENOMEM and EBUSY, make panthor_heap_grow()
> return ENOMEM no matter the reason of this allocation failure, the FW
> doesn't care anyway.
>
> v2:
> - Make panthor_heap_grow() return -ENOMEM for all kind of allocation
>   failures
> - Document the panthor_heap_grow() semantics
>
> Fixes: de8548813824 ("drm/panthor: Add the scheduler logical block")
> Signed-off-by: Antonino Maniscalco <antonino.maniscalco@xxxxxxxxxxxxx>
> Signed-off-by: Boris Brezillon <boris.brezillon@xxxxxxxxxxxxx>

Reviewed-by: Steven Price <steven.price@xxxxxxx>

Thanks,
Steve

> ---
>  drivers/gpu/drm/panthor/panthor_heap.c  | 12 ++++++++----
>  drivers/gpu/drm/panthor/panthor_sched.c |  7 ++++++-
>  2 files changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_heap.c b/drivers/gpu/drm/panthor/panthor_heap.c
> index 143fa35f2e74..c3c0ba744937 100644
> --- a/drivers/gpu/drm/panthor/panthor_heap.c
> +++ b/drivers/gpu/drm/panthor/panthor_heap.c
> @@ -410,6 +410,13 @@ int panthor_heap_return_chunk(struct panthor_heap_pool *pool,
>   * @renderpasses_in_flight: Number of render passes currently in-flight.
>   * @pending_frag_count: Number of fragment jobs waiting for execution/completion.
>   * @new_chunk_gpu_va: Pointer used to return the chunk VA.
> + *
> + * Return:
> + * - 0 if a new heap was allocated
> + * - -ENOMEM if the tiler context reached the maximum number of chunks
> + *   or if too many render passes are in-flight
> + *   or if the allocation failed
> + * - -EINVAL if any of the arguments passed to panthor_heap_grow() is invalid
>   */
>  int panthor_heap_grow(struct panthor_heap_pool *pool,
>                        u64 heap_gpu_va,
> @@ -439,10 +446,7 @@ int panthor_heap_grow(struct panthor_heap_pool *pool,
>           * handler provided by the userspace driver, if any).
>           */
>          if (renderpasses_in_flight > heap->target_in_flight ||
> -            (pending_frag_count > 0 && heap->chunk_count >= heap->max_chunks)) {
> -                ret = -EBUSY;
> -                goto out_unlock;
> -        } else if (heap->chunk_count >= heap->max_chunks) {
> +            heap->chunk_count >= heap->max_chunks) {
>                  ret = -ENOMEM;
>                  goto out_unlock;
>          }
> diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
> index b3a51a6de523..fd928362d45e 100644
> --- a/drivers/gpu/drm/panthor/panthor_sched.c
> +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> @@ -1354,7 +1354,12 @@ static int group_process_tiler_oom(struct panthor_group *group, u32 cs_id)
>                                              pending_frag_count, &new_chunk_va);
>          }
>
> -        if (ret && ret != -EBUSY) {
> +        /* If the heap context doesn't have memory for us, we want to let the
> +         * FW try to reclaim memory by waiting for fragment jobs to land or by
> +         * executing the tiler OOM exception handler, which is supposed to
> +         * implement incremental rendering.
> +         */
> +        if (ret && ret != -ENOMEM) {
>                  drm_warn(&ptdev->base, "Failed to extend the tiler heap\n");
>                  group->fatal_queues |= BIT(cs_id);
>                  sched_queue_delayed_work(sched, tick, 0);
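
For anyone reading along, the return-code contract in the patch above can be modelled in a few lines of userspace C. This is only a sketch and not the driver code: struct mock_heap, mock_heap_grow() and mock_oom_is_fatal() are hypothetical names, the chunk allocation is reduced to a counter bump, and pending_frag_count is left out because the post-patch limit check no longer looks at it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the heap fields the limit check reads. */
struct mock_heap {
	uint32_t chunk_count;      /* chunks currently owned by the heap */
	uint32_t max_chunks;       /* upper bound on chunk_count */
	uint32_t target_in_flight; /* render passes allowed before refusing */
};

/*
 * Models the post-patch check: hitting either limit yields -ENOMEM,
 * there is no separate -EBUSY case anymore.
 */
static int mock_heap_grow(struct mock_heap *heap,
			  uint32_t renderpasses_in_flight)
{
	assert(heap != NULL);

	if (renderpasses_in_flight > heap->target_in_flight ||
	    heap->chunk_count >= heap->max_chunks)
		return -ENOMEM;

	/* Assumption: a successful allocation is just a counter bump here. */
	heap->chunk_count++;
	return 0;
}

/*
 * Models the caller-side test in the scheduler hunk: -ENOMEM is handed
 * back to the FW, any other error marks the queue as fatal.
 */
static bool mock_oom_is_fatal(int ret)
{
	return ret && ret != -ENOMEM;
}
```

With this model, a heap that is already at max_chunks returns -ENOMEM and mock_oom_is_fatal() reports false, which corresponds to the "return a NULL chunk and let the FW run the OOM exception handler" path, while -EINVAL is still treated as fatal.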