[PATCH 1/2] [WIP]drm/ttm: add waiter list to prevent allocation not in order

zhoucm1@xxxxxxx (Chunming Zhou) · Wed, 31 Jan 2018 18:30:59 +0800

On 2018-01-26 22:35, Christian König wrote:
> I just realized that a change I'm thinking about for a while would 
> solve your problem as well, but keep concurrent allocation possible.
>
> See ttm_mem_evict_first() unlocks the BO after evicting it:
>>         ttm_bo_del_from_lru(bo);
>>         spin_unlock(&glob->lru_lock);
>>
>>         ret = ttm_bo_evict(bo, ctx);
>>         if (locked) {
>>                 ttm_bo_unreserve(bo); <-------- here
>>         } else {
>>                 spin_lock(&glob->lru_lock);
>>                 ttm_bo_add_to_lru(bo);
>>                 spin_unlock(&glob->lru_lock);
>>         }
>>
>>         kref_put(&bo->list_kref, ttm_bo_release_list);
>
> The effect is that in your example process C can not only beat process 
> B once, but many many times because we run into a ping/pong situation 
> where B evicts resources while C moves them back in.
For the ping/pong case, I want to disable busy placement during the 
allocation period and only enable it for CS BO validation.

>
> For a while now I'm thinking about dropping those reservations only 
> after the original allocation succeeded.
>
> The effect would be that process C can still beat process B initially, 
> but sooner or later process B would evict some resources from process C as 
> well and then it can succeed with its allocation.
If it comes from process C's CS validation, process B can still only evict 
the resource after process C's command submission has completed.

>
> The problem is that for this approach to work we need a core change to the 
> ww_mutexes to be able to handle this efficiently.
Yes, ww_mutex doesn't support this net lock, which easily deadlocks 
without a ticket and class.

So I think preventing validation in the same place is a simpler way:
process B's BO place is fpfn~lpfn, so it will only try to evict LRU BOs in 
that range. During eviction we just block other validations that target 
this range (fpfn~lpfn); allocations/validations outside this range can 
still go on.
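
To make the range rule concrete, here is a minimal userspace sketch of the overlap test it implies. It is not the code from the patch: struct pfn_range and ranges_overlap() are hypothetical names, the bounds are treated as inclusive to mirror the comparisons in the quoted ttm_man_check_bo(), and lpfn == 0 is assumed to mean "no upper limit". Unlike a check that only looks at whether one of the new request's endpoints falls inside a waiter's range, the two-sided comparison also catches a request that fully contains the waiter's range.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for the fpfn/lpfn pair of struct ttm_place. */
struct pfn_range {
    unsigned long fpfn; /* first page frame number, inclusive */
    unsigned long lpfn; /* last page frame number, inclusive (assumption);
                         * 0 is assumed to mean "no upper limit" */
};

/* Effective upper bound, mapping the assumed "0 = unlimited" convention. */
static unsigned long range_end(const struct pfn_range *r)
{
    return r->lpfn ? r->lpfn : ULONG_MAX;
}

/*
 * Two closed intervals overlap exactly when each one starts no later
 * than the other one ends. This covers partial overlap on either side
 * as well as one range fully containing the other.
 */
static bool ranges_overlap(const struct pfn_range *a,
                           const struct pfn_range *b)
{
    return a->fpfn <= range_end(b) && b->fpfn <= range_end(a);
}
```

With a waiter on 100~200, a new request for 50~300 would be held back by this test, while a request for 201~300 could proceed.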

Any downsides?

Regards,
David Zhou
>
> Regards,
> Christian.
>
> Am 26.01.2018 um 14:59 schrieb Christian König:
>> I know, but this has the same effect. You prevent concurrent 
>> allocation from happening.
>>
>> What we could do is to pipeline reusing of deleted memory as well, 
>> this makes it less likely to cause the problem you are seeing because 
>> the evicting process doesn't need to block for deleted BOs any more.
>>
>> But that other processes can grab memory during eviction is 
>> intentional. Otherwise greedy processes would completely dominate 
>> command submission.
>>
>> Regards,
>> Christian.
>>
>> Am 26.01.2018 um 14:50 schrieb Zhou, David(ChunMing):
>>> I don't want to prevent all of them; my new approach only prevents a 
>>> later allocation from getting ahead of an earlier one and taking the 
>>> memory space that the earlier one freed through eviction.
>>>
>>>
>>> Sent from Smartisan Nut Pro
>>>
>>> Christian König <ckoenig.leichtzumerken at gmail.com> wrote on 
>>> 2018-01-26 at 9:24 PM:
>>>
>>> Yes, exactly that's the problem.
>>>
>>> See when you want to prevent a process B from allocating the memory 
>>> process A has evicted, you need to prevent all concurrent allocation.
>>>
>>> And we don't do that because it causes a major performance drop.
>>>
>>> Regards,
>>> Christian.
>>>
>>> Am 26.01.2018 um 14:21 schrieb Zhou, David(ChunMing):
>>>> Your patch will prevent concurrent allocation and will cause a large 
>>>> drop in allocation performance.
>>>>
>>>> Sent from Smartisan Nut Pro
>>>>
>>>> Christian König <ckoenig.leichtzumerken at gmail.com> wrote on 
>>>> 2018-01-26 at 9:04 PM:
>>>>
>>>> Attached is what you actually want to do cleanly implemented. But 
>>>> as I said this is a NO-GO.
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>> Am 26.01.2018 um 13:43 schrieb Christian König:
>>>>>> After my investigation, this issue appears to be a defect of the TTM 
>>>>>> design itself, which breaks scheduling balance.
>>>>> Yeah, but again. This is intended design we can't change easily.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>> Am 26.01.2018 um 13:36 schrieb Zhou, David(ChunMing):
>>>>>> I am off work, so I am replying by phone; the format may not be plain text.
>>>>>>
>>>>>> Back to the topic itself:
>>>>>> the problem does happen with the amdgpu driver. Someone reported to me 
>>>>>> that when an application runs as two instances, their performance 
>>>>>> differs.
>>>>>> I also reproduced the issue with a unit test (bo_eviction_test). 
>>>>>> They always think our scheduler isn't working as expected.
>>>>>>
>>>>>> After my investigation, this issue appears to be a defect of the TTM 
>>>>>> design itself, which breaks scheduling balance.
>>>>>>
>>>>>> Further, if we run containers on our GPU, container A could get a 
>>>>>> high score while container B gets a low score with the same benchmark.
>>>>>>
>>>>>> So this is a bug that we need to fix.
>>>>>>
>>>>>> Regards,
>>>>>> David Zhou
>>>>>>
>>>>>> Sent from Smartisan Nut Pro
>>>>>>
>>>>>> Christian König <ckoenig.leichtzumerken at gmail.com> wrote on 
>>>>>> 2018-01-26 at 6:31 PM:
>>>>>>
>>>>>> Am 26.01.2018 um 11:22 schrieb Chunming Zhou:
>>>>>> > there is a scheduling balance issue around get_node, like:
>>>>>> > a. process A allocates all of the memory and uses it for submission.
>>>>>> > b. process B tries to allocate memory and waits in eviction for process A's BO to become idle.
>>>>>> > c. process A completes the job and process B's eviction puts process A's BO node,
>>>>>> > but in the meantime process C comes to allocate a BO, directly gets the node successfully and does its submission,
>>>>>> > so process B again waits for process C's BO to become idle.
>>>>>> > d. repeating the above steps, process B could be delayed much more.
>>>>>> >
>>>>>> > a later allocation must not get ahead of an earlier one in the same place.
>>>>>>
>>>>>> Again NAK to the whole approach.
>>>>>>
>>>>>> At least with amdgpu the problem you described above never occurs
>>>>>> because evictions are pipelined operations. We could only block for
>>>>>> deleted regions to become free.
>>>>>>
>>>>>> But independent of that incoming memory requests while we make 
>>>>>> room for
>>>>>> eviction are intended to be served first.
>>>>>>
>>>>>> Changing that is certainly a no-go cause that would favor memory 
>>>>>> hungry
>>>>>> applications over small clients.
>>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>> >
>>>>>> > Change-Id: I3daa892e50f82226c552cc008a29e55894a98f18
>>>>>> > Signed-off-by: Chunming Zhou <david1.zhou at amd.com>
>>>>>> > ---
>>>>>> >   drivers/gpu/drm/ttm/ttm_bo.c    | 69 +++++++++++++++++++++++++++++++++++++++--
>>>>>> >   include/drm/ttm/ttm_bo_api.h    |  7 +++++
>>>>>> >   include/drm/ttm/ttm_bo_driver.h |  7 +++++
>>>>>> >   3 files changed, 80 insertions(+), 3 deletions(-)
>>>>>> >
>>>>>> > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>> > index d33a6bb742a1..558ec2cf465d 100644
>>>>>> > --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>> > +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>> > @@ -841,6 +841,58 @@ static int ttm_bo_add_move_fence(struct ttm_buffer_object *bo,
>>>>>> >          return 0;
>>>>>> >  }
>>>>>> >
>>>>>> > +static void ttm_man_init_waiter(struct ttm_bo_waiter *waiter,
>>>>>> > +                                struct ttm_buffer_object *bo,
>>>>>> > +                                const struct ttm_place *place)
>>>>>> > +{
>>>>>> > +        waiter->tbo = bo;
>>>>>> > +        memcpy((void *)&waiter->place, (void *)place, sizeof(*place));
>>>>>> > +        INIT_LIST_HEAD(&waiter->list);
>>>>>> > +}
>>>>>> > +
>>>>>> > +static void ttm_man_add_waiter(struct ttm_mem_type_manager *man,
>>>>>> > +                               struct ttm_bo_waiter *waiter)
>>>>>> > +{
>>>>>> > +        if (!waiter)
>>>>>> > +                return;
>>>>>> > +        spin_lock(&man->wait_lock);
>>>>>> > +        list_add_tail(&waiter->list, &man->waiter_list);
>>>>>> > +        spin_unlock(&man->wait_lock);
>>>>>> > +}
>>>>>> > +
>>>>>> > +static void ttm_man_del_waiter(struct ttm_mem_type_manager *man,
>>>>>> > +                               struct ttm_bo_waiter *waiter)
>>>>>> > +{
>>>>>> > +        if (!waiter)
>>>>>> > +                return;
>>>>>> > +        spin_lock(&man->wait_lock);
>>>>>> > +        if (!list_empty(&waiter->list))
>>>>>> > +                list_del(&waiter->list);
>>>>>> > +        spin_unlock(&man->wait_lock);
>>>>>> > +        kfree(waiter);
>>>>>> > +}
>>>>>> > +
>>>>>> > +int ttm_man_check_bo(struct ttm_mem_type_manager *man,
>>>>>> > +                     struct ttm_buffer_object *bo,
>>>>>> > +                     const struct ttm_place *place)
>>>>>> > +{
>>>>>> > +        struct ttm_bo_waiter *waiter, *tmp;
>>>>>> > +
>>>>>> > +        spin_lock(&man->wait_lock);
>>>>>> > +        list_for_each_entry_safe(waiter, tmp, &man->waiter_list, list) {
>>>>>> > +                if ((bo != waiter->tbo) &&
>>>>>> > +                    ((place->fpfn >= waiter->place.fpfn &&
>>>>>> > +                      place->fpfn <= waiter->place.lpfn) ||
>>>>>> > +                     (place->lpfn <= waiter->place.lpfn && place->lpfn >=
>>>>>> > +                      waiter->place.fpfn)))
>>>>>> > +                        goto later_bo;
>>>>>> > +        }
>>>>>> > +        spin_unlock(&man->wait_lock);
>>>>>> > +        return true;
>>>>>> > +later_bo:
>>>>>> > +        spin_unlock(&man->wait_lock);
>>>>>> > +        return false;
>>>>>> > +}
>>>>>> >  /**
>>>>>> >   * Repeatedly evict memory from the LRU for @mem_type until we create enough
>>>>>> >   * space, or we've evicted everything and there isn't enough space.
>>>>>> > @@ -853,17 +905,26 @@ static int ttm_bo_mem_force_space(struct ttm_buffer_object *bo,
>>>>>> >  {
>>>>>> >          struct ttm_bo_device *bdev = bo->bdev;
>>>>>> >          struct ttm_mem_type_manager *man = &bdev->man[mem_type];
>>>>>> > +        struct ttm_bo_waiter waiter;
>>>>>> >          int ret;
>>>>>> >
>>>>>> > +        ttm_man_init_waiter(&waiter, bo, place);
>>>>>> > +        ttm_man_add_waiter(man, &waiter);
>>>>>> >          do {
>>>>>> >                  ret = (*man->func->get_node)(man, bo, place, mem);
>>>>>> > -                if (unlikely(ret != 0))
>>>>>> > +                if (unlikely(ret != 0)) {
>>>>>> > +                        ttm_man_del_waiter(man, &waiter);
>>>>>> >                          return ret;
>>>>>> > -                if (mem->mm_node)
>>>>>> > +                }
>>>>>> > +                if (mem->mm_node) {
>>>>>> > +                        ttm_man_del_waiter(man, &waiter);
>>>>>> >                          break;
>>>>>> > +                }
>>>>>> >                  ret = ttm_mem_evict_first(bdev, mem_type, place, ctx);
>>>>>> > -                if (unlikely(ret != 0))
>>>>>> > +                if (unlikely(ret != 0)) {
>>>>>> > +                        ttm_man_del_waiter(man, &waiter);
>>>>>> >                          return ret;
>>>>>> > +                }
>>>>>> >          } while (1);
>>>>>> >          mem->mem_type = mem_type;
>>>>>> >          return ttm_bo_add_move_fence(bo, man, mem);
>>>>>> > @@ -1450,6 +1511,8 @@ int ttm_bo_init_mm(struct ttm_bo_device *bdev, unsigned type,
>>>>>> >          man->use_io_reserve_lru = false;
>>>>>> >          mutex_init(&man->io_reserve_mutex);
>>>>>> >          spin_lock_init(&man->move_lock);
>>>>>> > +        spin_lock_init(&man->wait_lock);
>>>>>> > +        INIT_LIST_HEAD(&man->waiter_list);
>>>>>> >          INIT_LIST_HEAD(&man->io_reserve_lru);
>>>>>> >
>>>>>> >          ret = bdev->driver->init_mem_type(bdev, type, man);
>>>>>> > diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
>>>>>> > index 2cd025c2abe7..0fce4dbd02e7 100644
>>>>>> > --- a/include/drm/ttm/ttm_bo_api.h
>>>>>> > +++ b/include/drm/ttm/ttm_bo_api.h
>>>>>> > @@ -40,6 +40,7 @@
>>>>>> >  #include <linux/mm.h>
>>>>>> >  #include <linux/bitmap.h>
>>>>>> >  #include <linux/reservation.h>
>>>>>> > +#include <drm/ttm/ttm_placement.h>
>>>>>> >
>>>>>> >  struct ttm_bo_device;
>>>>>> >
>>>>>> > @@ -232,6 +233,12 @@ struct ttm_buffer_object {
>>>>>> >          struct mutex wu_mutex;
>>>>>> >  };
>>>>>> >
>>>>>> > +struct ttm_bo_waiter {
>>>>>> > +        struct ttm_buffer_object *tbo;
>>>>>> > +        struct ttm_place place;
>>>>>> > +        struct list_head list;
>>>>>> > +};
>>>>>> > +
>>>>>> >  /**
>>>>>> >   * struct ttm_bo_kmap_obj
>>>>>> >   *
>>>>>> > diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h
>>>>>> > index 9b417eb2df20..dc6b8b4c9e06 100644
>>>>>> > --- a/include/drm/ttm/ttm_bo_driver.h
>>>>>> > +++ b/include/drm/ttm/ttm_bo_driver.h
>>>>>> > @@ -293,6 +293,10 @@ struct ttm_mem_type_manager {
>>>>>> >          bool io_reserve_fastpath;
>>>>>> >          spinlock_t move_lock;
>>>>>> >
>>>>>> > +        /* waiters in list */
>>>>>> > +        spinlock_t wait_lock;
>>>>>> > +        struct list_head waiter_list;
>>>>>> > +
>>>>>> >          /*
>>>>>> >           * Protected by @io_reserve_mutex:
>>>>>> >           */
>>>>>> > @@ -748,6 +752,9 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
>>>>>> >                       struct ttm_mem_reg *mem,
>>>>>> >                       struct ttm_operation_ctx *ctx);
>>>>>> >
>>>>>> > +int ttm_man_check_bo(struct ttm_mem_type_manager *man,
>>>>>> > +                     struct ttm_buffer_object *bo,
>>>>>> > +                     const struct ttm_place *place);
>>>>>> >  void ttm_bo_mem_put(struct ttm_buffer_object *bo, struct ttm_mem_reg *mem);
>>>>>> >  void ttm_bo_mem_put_locked(struct ttm_buffer_object *bo,
>>>>>> >                             struct ttm_mem_reg *mem);
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> amd-gfx mailing list
>>>>>> amd-gfx at lists.freedesktop.org
>>>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> dri-devel mailing list
>>>> dri-devel at lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>>>
>>
>
