On Tue, Apr 23, 2019 at 4:42 PM Christian König
<ckoenig.leichtzumerken@xxxxxxxxx> wrote:
>
> Well that's not so easy off-hand.
>
> The basic problem here is that when you busy wait at this place you can
> easily run into situations where application A busy waits for B while B
> busy waits for A -> deadlock.
>
> So what we need here is the deadlock detection logic of the ww_mutex. To
> use this we at least need to do the following steps:
>
> 1. Reserve the BO in DC using a ww_mutex ticket (trivial).
>
> 2. If we then run into this EBUSY condition in TTM, check if the BO we
> need memory for (or rather the ww_mutex of its reservation object) has a
> ticket assigned.
>
> 3. If we have a ticket, we grab a reference to the first BO on the LRU,
> drop the LRU lock and try to grab the reservation lock with the ticket.
>
> 4. If getting the reservation lock with the ticket succeeded, we check if
> the BO is still the first one on the LRU in question (the BO could have
> moved).

I don't think you actually need this check. Once you're in this slow
reclaim mode all hope for performance is pretty much lost (you're
thrashing vram terribly), forward progress matters. Also, less code :-)

> 5. If the BO is still the first one on the LRU in question, we try to
> evict it as we would evict any other BO.
>
> 6. If any of the "If's" above fail, we just back off and return -EBUSY.

Another idea I pondered (but never implemented) is a slow reclaim lru
lock. Essentially there'd be two ways to walk the lru and evict bo:

- fast path: spinlock + trylock, like today

- slow path: ww_mutex lru lock, plus every bo is reserved (nested within
  that ww_mutex lru lock) with a full ww_mutex_lock. Guaranteed forward
  progress.

Transition would be that as soon as someone hits an EBUSY, they set the
slow reclaim flag (while briefly holding the quick reclaim spinlock, which
will drain anyone still stuck in the fast reclaim path). Every time fast
reclaim acquires the spinlock it needs to check for the slow reclaim flag,
and if that's set, fall back to slow reclaim.

Transitioning out of slow reclaim would only happen once the thread (with
its ww ticket) that hit the EBUSY has completed whatever it was trying to
do (either successfully, or failed because even evicting everyone else
didn't give it enough vram). The tricky part here is making sure threads
still in slow reclaim don't blow up if we switch back. Since only ever one
thread can actually be doing slow reclaim (everyone is serialized on the
single ww_mutex lru lock), this should be doable by checking the slow
reclaim condition once you have the lru ww_mutex, and if the slow reclaim
condition has been lifted, switching back to fast reclaim. The slow
reclaim condition might also need to be a full reference count, to handle
multiple threads hitting EBUSY/slow reclaim without the book-keeping
getting all confused.

The upshot of this is that it guarantees forward progress, but the perf
cliff might be too steep if this happens too often. You might need to
round it off with 1-2 retries when you hit EBUSY, before forcing slow
reclaim. [Rough sketches of both ideas follow below.]

-Daniel

> Steps 2-5 are certainly not trivial, but doable as far as I can see.
>
> Have fun :)
> Christian.
>
> On 23.04.19 at 15:19, Zhou, David(ChunMing) wrote:
>
> How about adding a further condition on ctx->resv inline to address your
> concern? As well as not waiting on the same user, it shouldn't lead to
> deadlock.
>
> Otherwise, any other idea?
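[Editor's note: for illustration only, here is a rough sketch of how
Christian's steps 2-6 above might look inside ttm_bo.c. This is NOT actual
TTM code: first_bo_on_lru() is a made-up helper, and ttm_bo_evict(),
ttm_bo_release_list() and the bdev->glob->lru_lock usage are assumptions
about the TTM internals of that era.]

#include <drm/ttm/ttm_bo_driver.h>	/* sketch assumes ttm_bo.c context */

/* Made-up helper: return the first BO on this manager's LRU, or NULL. */
static struct ttm_buffer_object *
first_bo_on_lru(struct ttm_mem_type_manager *man);

static int ttm_mem_evict_first_busy(struct ttm_bo_device *bdev,
				    struct ttm_mem_type_manager *man,
				    struct ww_acquire_ctx *ticket)
{
	struct ttm_operation_ctx evict_ctx = { .interruptible = true };
	struct ttm_buffer_object *bo;
	int ret;

	/* Step 2: no ticket means no deadlock detection, just back off. */
	if (!ticket)
		return -EBUSY;

	/* Step 3: grab a reference to the first BO on the LRU and drop
	 * the LRU lock before we block. */
	spin_lock(&bdev->glob->lru_lock);
	bo = first_bo_on_lru(man);
	if (!bo) {
		spin_unlock(&bdev->glob->lru_lock);
		return -EBUSY;
	}
	kref_get(&bo->list_kref);
	spin_unlock(&bdev->glob->lru_lock);

	/* Step 3, continued: blocking acquire with ww_mutex deadlock
	 * detection; -EDEADLK means we lost the ww ordering and must
	 * back off (step 6). */
	ret = ww_mutex_lock(&bo->resv->lock, ticket);
	if (ret) {
		kref_put(&bo->list_kref, ttm_bo_release_list);
		return -EBUSY;
	}

	/* Steps 4/5, with Daniel's simplification: skip the "still first
	 * on the LRU?" recheck and evict the BO like any other. */
	ret = ttm_bo_evict(bo, &evict_ctx);

	ww_mutex_unlock(&bo->resv->lock);
	kref_put(&bo->list_kref, ttm_bo_release_list);
	return ret;
}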
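[Editor's note: and a rough sketch of the fast/slow reclaim transition
Daniel describes. Again, every name is made up for illustration, and the
draining and hand-back subtleties he mentions are only hinted at in the
comments, not fully handled.]

#include <linux/atomic.h>
#include <linux/spinlock.h>
#include <linux/ww_mutex.h>

/* Hypothetical two-mode LRU state, NOT actual TTM code. */
struct reclaim_lru {
	spinlock_t fast_lock;		/* today's LRU spinlock */
	struct ww_mutex slow_lock;	/* slow-path LRU ww_mutex */
	atomic_t slow_count;		/* > 0: slow reclaim is forced */
};

/* Made-up eviction helpers for the two paths. */
static int evict_first_trylock(struct reclaim_lru *lru);
static int evict_first_blocking(struct reclaim_lru *lru,
				struct ww_acquire_ctx *ticket);

static int reclaim_evict_first(struct reclaim_lru *lru,
			       struct ww_acquire_ctx *ticket)
{
	int ret;

	/* Fast path: spinlock + trylock, as long as nobody has flipped
	 * us into slow reclaim. Taking fast_lock here is also what
	 * drains threads still inside the fast path on a transition. */
	spin_lock(&lru->fast_lock);
	if (!atomic_read(&lru->slow_count)) {
		ret = evict_first_trylock(lru);
		spin_unlock(&lru->fast_lock);
		if (ret != -EBUSY)
			return ret;
		/* EBUSY: force slow reclaim. A counter rather than a
		 * flag keeps the book-keeping sane when several threads
		 * hit EBUSY at once. */
		atomic_inc(&lru->slow_count);
	} else {
		spin_unlock(&lru->fast_lock);
	}

	/* Slow path: serialize on the ww_mutex LRU lock; every BO is
	 * then reserved with a full ww_mutex_lock(ticket), so ww_mutex
	 * deadlock detection guarantees forward progress. A real
	 * implementation would recheck slow_count here and drop back to
	 * the fast path if slow reclaim was lifted while we waited. */
	ret = ww_mutex_lock(&lru->slow_lock, ticket);
	if (ret)
		return ret;	/* -EDEADLK: back off and retry */
	ret = evict_first_blocking(lru, ticket);
	ww_mutex_unlock(&lru->slow_lock);
	return ret;
}

/* Called by the thread that forced slow reclaim once its whole operation
 * has completed (success or final failure); the last one out re-enables
 * fast reclaim. */
static void reclaim_slow_done(struct reclaim_lru *lru)
{
	atomic_dec(&lru->slow_count);
}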
>
> -------- Original Message --------
> Subject: Re: [PATCH] ttm: wait mem space if user allow while gpu busy
> From: Christian König
> To: "Liang, Prike" ,"Zhou, David(ChunMing)" ,dri-devel@xxxxxxxxxxxxxxxxxxxxx
> CC:
>
> Well that is certainly a NAK, because it can lead to deadlock in the
> memory management.
>
> You can't just busy wait with all those locks held.
>
> Regards,
> Christian.
>
> On 23.04.19 at 03:45, Liang, Prike wrote:
> > Acked-by: Prike Liang <Prike.Liang@xxxxxxx>
> >
> > Thanks,
> > Prike
> > -----Original Message-----
> > From: Chunming Zhou <david1.zhou@xxxxxxx>
> > Sent: Monday, April 22, 2019 6:39 PM
> > To: dri-devel@xxxxxxxxxxxxxxxxxxxxx
> > Cc: Liang, Prike <Prike.Liang@xxxxxxx>; Zhou, David(ChunMing) <David1.Zhou@xxxxxxx>
> > Subject: [PATCH] ttm: wait mem space if user allow while gpu busy
> >
> > A heavy gpu job could occupy memory for a long time, which could lead
> > to other users failing to get memory.
> >
> > Change-Id: I0b322d98cd76e5ac32b00462bbae8008d76c5e11
> > Signed-off-by: Chunming Zhou <david1.zhou@xxxxxxx>
> > ---
> >   drivers/gpu/drm/ttm/ttm_bo.c | 6 ++++--
> >   1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> > index 7c484729f9b2..6c596cc24bec 100644
> > --- a/drivers/gpu/drm/ttm/ttm_bo.c
> > +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> > @@ -830,8 +830,10 @@ static int ttm_bo_mem_force_space(struct ttm_buffer_object *bo,
> >   		if (mem->mm_node)
> >   			break;
> >   		ret = ttm_mem_evict_first(bdev, mem_type, place, ctx);
> > -		if (unlikely(ret != 0))
> > -			return ret;
> > +		if (unlikely(ret != 0)) {
> > +			if (!ctx || ctx->no_wait_gpu || ret != -EBUSY)
> > +				return ret;
> > +		}
> >   	} while (1);
> >   	mem->mem_type = mem_type;
> >   	return ttm_bo_add_move_fence(bo, man, mem);
> > --
> > 2.17.1

--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel
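[Editor's note: finally, a caller-side sketch of what the patch in the
quoted thread enables. struct ttm_operation_ctx and ttm_bo_validate() are
real TTM interfaces of that era; the surrounding driver variables (bo,
placement) are hypothetical.]

/* Hypothetical driver call site: with the patch applied, leaving
 * no_wait_gpu clear lets ttm_bo_mem_force_space() retry on -EBUSY
 * instead of failing immediately. */
struct ttm_operation_ctx ctx = {
	.interruptible = true,	/* allow signals while blocking */
	.no_wait_gpu = false,	/* opt in to waiting for busy memory */
};
int ret = ttm_bo_validate(bo, &placement, &ctx);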