Plan: BO move throttling for visible VRAM evictions

On May 15, 2017 6:40 AM, "zhoucm1" <david1.zhou at amd.com> wrote:



On 2017-05-14 05:31, Marek Olšák wrote:

> On Mon, Apr 17, 2017 at 11:55 AM, Michel Dänzer <michel at daenzer.net>
> wrote:
>
>> On 17/04/17 07:58 AM, Marek Olšák wrote:
>>
>>> On Fri, Apr 14, 2017 at 12:14 PM, Michel Dänzer <michel at daenzer.net>
>>> wrote:
>>>
>>>> On 04/04/17 05:11 AM, Marek Olšák wrote:
>>>>
>>>>> On Fri, Mar 31, 2017 at 5:24 AM, Michel Dänzer <michel at daenzer.net>
>>>>> wrote:
>>>>>
>>>>>> On 30/03/17 07:03 PM, Michel Dänzer wrote:
>>>>>>
>>>>>>> On 25/03/17 01:33 AM, Marek Olšák wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I'm sharing this idea here, because it's something that has been
>>>>>>>> decreasing our performance a lot recently, for example:
>>>>>>>> http://openbenchmarking.org/prospect/1703011-RI-RADEONDIR06/7b7668cfc109d1c3dc27e871c8aea71ca13f23fa
>>>>>>>>
>>>>>>> The attached proof-of-concept patch (on top of Christian's "CPU mapping
>>>>>>> of split VRAM buffers" series, ported from radeon) results in 145.05 fps
>>>>>>> on my Tonga.
>>>>>>>
>>>>>> I get the same result without my or Christian's patches though, with
>>>>>> 4.11 based DRM or amd-staging-4.9. So I guess I just can't reproduce the
>>>>>> problem with this test. Are there any other tests for it?
>>>>>>
>>>>> It's random. Sometimes the benchmark runs OK, other times it's slow.
>>>>> You can easily see the difference by observing how smooth it is. The
>>>>> visible VRAM evictions result in constant 100-200ms stalls but not
>>>>> every frame, which feels like the frame rate is much lower than it
>>>>> actually is.
>>>>>
>>>>> Make sure your graphics details are maxed out. The best score I can
>>>>> get with my rig is 70 fps. (Fiji & Core i5 3570)
>>>>>
>>>> I'm getting around 53-54 fps at Ultra with Tonga, both with Mesa 13.0.6
>>>> and Git.
>>>>
>>>> Have you tried if Christian's patches for CPU access to split VRAM
>>>> buffers help? I can imagine that forcing contiguous VRAM buffers for CPU
>>>> access could cause lots of other BOs to be unnecessarily evicted from
>>>> VRAM, if at least one of their fragments happens to be in the CPU
>>>> visible part of VRAM.
>>>>
>>> I've finally tested latest amd-staging-4.9 and I'm very pleased. For
>>> the first time, the Deus Ex benchmark has almost no hiccups. I've
>>> never seen it so smooth. At one point, the BO move rate increased
>>> to 200 MB/s, stayed there for a couple of seconds, and then it dropped
>>> to 0 again. The frame rate was OK-ish, so I guess the moves didn't
>>> happen all at once. I also tested DiRT Rally and I haven't been able
>>> to reproduce the low FPS with the consistently-high BO move rate that
>>> I saw several months ago.
>>>
>>> We could do some move throttling there for sure, but it's much better
>>> than it ever was.
>>>
>> That's great to hear. If you get a chance, it would be interesting if
>> the attached updated patch improves things even more for you. (The patch
>> I attached previously couldn't work as intended, this one at least might
>> :)
>>
> Frogging101 on IRC noticed that we get a ton of TTM BO moves due to
> visible VRAM thrashing and Michel's patch doesn't help. His kernel is
> up to date with amd-staging. It looks like the only option left is my
> original plan: BO move throttling for visible VRAM by redirecting
> mapped buffers to GTT and not allowing them to go back to VRAM if some
> counter is too high.
>
I agree. In our performance tuning experiments, this case does happen often,
especially under VRAM memory pressure. Redirecting to GTT is better than
heavy eviction between VRAM and GTT.
But we need a condition for redirecting (an eviction counter?); otherwise
the BO has no chance to move back to its preferred domain.
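
To make the eviction-counter idea concrete, here is a minimal sketch in
Python. It is a toy userspace model, not amdgpu or TTM code. The names
MoveThrottle, threshold, decay_per_tick and placement_for are hypothetical,
and the decay step is only one assumed way to let a BO return to its
preferred domain once the thrashing stops.

```python
from dataclasses import dataclass

VRAM = "VRAM"
GTT = "GTT"


@dataclass
class MoveThrottle:
    """Toy model of an eviction counter with decay (hypothetical names)."""

    threshold: int = 8       # assumed: evictions tolerated before redirecting
    decay_per_tick: int = 1  # assumed: how far the counter drops per time tick
    counter: int = 0

    def note_eviction(self) -> None:
        # Called whenever a BO is evicted from visible VRAM.
        self.counter += 1

    def tick(self) -> None:
        # Periodic decay, so redirection ends once evictions stop.
        self.counter = max(0, self.counter - self.decay_per_tick)

    def placement_for(self, preferred: str, cpu_mapped: bool) -> str:
        # Redirect only CPU-mapped BOs that prefer VRAM, and only while
        # the counter is above the threshold.
        if preferred == VRAM and cpu_mapped and self.counter > self.threshold:
            return GTT
        return preferred


throttle = MoveThrottle(threshold=2)
for _ in range(3):
    throttle.note_eviction()

redirected = throttle.placement_for(VRAM, cpu_mapped=True)  # counter 3 > 2
throttle.tick()                                             # counter is now 2
restored = throttle.placement_for(VRAM, cpu_mapped=True)    # back to preferred
```

With the decay in place, a mapped BO is sent to GTT only while the counter
stays above the threshold. Without some decay or reset, the counter would
only grow and the BO would never get back to VRAM.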


You're talking about something different. VRAM memory pressure is a solved
problem in amdgpu. There is even a kernel parameter that controls the
amount of buffer moves between VRAM and GTT. So you can control the amount
of evictions today.

This discussion is about evictions between visible VRAM and invisible VRAM.

Marek


Regards,
David Zhou

>
> Opinions?
>
> Marek
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>

