Thank you. Sorry, I already pushed it with Alex's R-B, without yours. On 16-08-16 03:53 AM, Christian König wrote: > Am 15.08.2016 um 23:03 schrieb Alex Deucher: >> On Mon, Aug 15, 2016 at 3:06 PM, Felix Kuehling >> <felix.kuehling at amd.com> wrote: >>> Patch against current amd-staging-4.6 is attached. >> Reviewed-by: Alex Deucher <alexander.deucher at amd.com> > > Reviewed-by: Christian König <christian.koenig at amd.com>. > >> >>> Regards, >>> Felix >>> >>> >>> On 16-08-13 05:25 AM, Christian König wrote: >>>> Am 13.08.2016 um 01:22 schrieb Felix Kuehling: >>>>> [CC Kent FYI] >>>>> >>>>> On 16-08-11 04:31 PM, Deucher, Alexander wrote: >>>>>>> -----Original Message----- >>>>>>> From: amd-gfx [mailto:amd-gfx-bounces at lists.freedesktop.org] On >>>>>>> Behalf >>>>>>> Of Felix Kuehling >>>>>>> Sent: Thursday, August 11, 2016 3:52 PM >>>>>>> To: Michel Dänzer; Christian König >>>>>>> Cc: amd-gfx at lists.freedesktop.org >>>>>>> Subject: Reverted another change to fix buffer move hangs (was Re: >>>>>>> [PATCH] drm/ttm: partial revert "cleanup >>>>>>> ttm_tt_(unbind|destroy)" v2) >>>>>>> >>>>>>> We had to revert another change on the KFD branch to fix a >>>>>>> buffer move >>>>>>> problem: 8b6b79f43801f00ddcdc10a4d5719eba4b2e32aa (drm/amdgpu: >>>>>>> group BOs >>>>>>> by log2 of the size on the LRU v2 >>>>>> That makes sense. I think you may want a different LRU scheme for >>>>>> KFD or at least special handling for KFD buffers. >>>>> [FK] But I think the patch shouldn't cause hangs, regardless. >>>>> >>>>> I eventually found what the problem was. The "group BOs by log2 of >>>>> the >>>>> size on the LRU v2" patch exposed a latent bug related to the GART >>>>> size. >>>>> On our KFD branch, we calculate the GART size differently, and it can >>>>> easily go above 4GB. I think on amd-staging-4.6 the GART size can >>>>> also >>>>> go above 4GB on cards with lots of VRAM. >>>>> >>>>> However, the offset parameter in amdgpu_gart_bind and unbind is only >>>>> 32-bit. With the patch our test ended up using GART offsets beyond >>>>> 4GB >>>>> for the first time. Changing the offset parameter to uint64_t >>>>> fixes the >>>>> problem. >>>> Nice catch, please provide a patch to fix this. >>>> >>>>> Our test also demonstrates a potential flaw in the log2 grouping >>>>> patch: >>>>> When a buffer of a previously unused size is added to the LRU, it >>>>> gets >>>>> added to the front of the list, rather than the tail. So an >>>>> application >>>>> that allocates a very large buffer after a bunch of smaller >>>>> buffers, is >>>>> very likely to have that buffer evicted over and over again before >>>>> any >>>>> smaller buffers are considered for eviction. I believe, this can >>>>> result >>>>> in thrashing of large buffers. >>>>> >>>>> Some other observations: When the last BO of a given size is removed >>>>> from the LRU list, the LRU tail for that size is left "floating" >>>>> in the >>>>> middle of the LRU list. So the next BO of that size that is added, >>>>> will >>>>> be added at an arbitrary position in the list. It may even end up >>>>> in the >>>>> middle of a block of pages of a different size. So a log2 grouping >>>>> may >>>>> end up being split. >>>> Yeah, those are more or less known issues. >>>> >>>> Keep in mind that we only added the grouping by log2 of the size to >>>> have a justification to push the TTM changes upstream for the coming >>>> KFD fences. >>>> >>>> E.g. so that we are able to have this upstream before we try to push >>>> on the fence code. >>>> >>>> I will take a look at fixing those issues when I have time, shouldn't >>>> be to complicated to set the entries to zero when they aren't used or >>>> adjust other entries as well when some are added. >>>> >>>> Regards, >>>> Christian. >>>> >>>>> Regards, >>>>> Felix >>>>> >>>>> _______________________________________________ >>>>> amd-gfx mailing list >>>>> amd-gfx at lists.freedesktop.org >>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx >>>> >>> >>> _______________________________________________ >>> amd-gfx mailing list >>> amd-gfx at lists.freedesktop.org >>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx >>> >