Random short freezes due to TTM buffer migrations

alexdeucher@xxxxxxxxx (Alex Deucher) · Tue, 16 Aug 2016 11:50:55 -0400

On Tue, Aug 16, 2016 at 11:27 AM, Christian König
<deathsimple at vodafone.de> wrote:
> Hi Marek,
>
> I'm already working on this.
>
> My current approach is to use a custom BO manager for VRAM with TTM and so
> split allocations into chunks of 4MB.
>
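As a rough illustration of why chunking helps (a toy model only, not TTM
code): the free-hole list, the helper names and the sizes below are all
made up for the example. A fragmented free list can refuse a single
contiguous range yet still have enough 4MB pieces for the same buffer.

```python
CHUNK_MB = 4  # chunk size taken from the quoted proposal


def alloc_contiguous(free_holes, size_mb):
    """Return the index of the first hole that fits the whole buffer, or None."""
    for i, hole in enumerate(free_holes):
        if hole >= size_mb:
            return i
    return None


def alloc_chunked(free_holes, size_mb, chunk_mb=CHUNK_MB):
    """Return how many chunks each hole contributes, or None if it does not fit."""
    needed = -(-size_mb // chunk_mb)  # ceiling division
    plan = []
    for hole in free_holes:
        take = min(hole // chunk_mb, needed)
        plan.append(take)
        needed -= take
    return plan if needed == 0 else None


# Hypothetical 2 GB card: 1200 MB free in total, split across three holes (in MB).
holes = [500, 300, 400]

contiguous = alloc_contiguous(holes, 1024)  # no single hole holds 1024 MB
chunked = alloc_chunked(holes, 1024)        # 256 chunks spread over the holes
```

In this model the 1024 MB request only succeeds once it may be scattered
over several holes; a real manager also has to track and map every chunk.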

How about handling VRAM in fragment-size pages (64k), or does the
overhead get too high?  Or add logic to migrate to contiguous memory for
scanout or for engines without VM support.  We could reserve part of
the GART aperture for driver use for paged VRAM migration handling.
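
For a sense of scale on the overhead question, here is a small sketch that
just counts how many pieces a 512 MB buffer (the size from Marek's example)
breaks into at each granularity mentioned in this thread. The function and
table names are invented for the example, and it says nothing about the
actual per-piece cost in TTM.

```python
KB = 1024
MB = 1024 * KB


def chunk_count(buffer_bytes, granule_bytes):
    """Number of granules needed to cover the buffer (ceiling division)."""
    return -(-buffer_bytes // granule_bytes)


# Granularities discussed in the thread.
GRANULES = {"64 KB": 64 * KB, "4 MB": 4 * MB, "16 MB": 16 * MB}

counts = {label: chunk_count(512 * MB, size) for label, size in GRANULES.items()}

for label, n in counts.items():
    print(f"{label:>6}: {n} pieces per 512 MB buffer")
```

That is 8192 pieces at 64k versus 128 at 4MB and 32 at 16MB, so whatever the
per-piece bookkeeping costs, it is paid 64 times more often at 64k than at 4MB.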

Alex

> Large BOs are still swapped out as one, but it makes it much more likely
> that you can allocate 1/2 of VRAM as one buffer.
>
> Give me till the end of the week to finish this and then we can test if
> that's sufficient or if we need to do more.
>
> Regards,
> Christian.
>
>
> On 16.08.2016 at 16:33, Marek Olšák wrote:
>>
>> Hi,
>>
>> I'm seeing random temporary freezes (up to 2 seconds) under memory
>> pressure. Before I describe the exact circumstances, I'd like to say
>> that this is a serious issue affecting playability of certain AAA
>> Linux games.
>>
>> In order to reproduce this, an application should:
>> - allocate a few very large buffers (256-512 MB per buffer)
>> - allocate more memory than there is available VRAM. The issue also
>> occurs (but at a lower frequency) if the app needs only 80% of VRAM.
>>
>> Example: ttm_bo_validate needs to migrate a 512 MB buffer. The total
>> size of moved memory for that call can be as high as 1.5 GB. This is
>> always followed by a big temporary drop in VRAM usage.
>>
>> The game I'm testing needs 3.4 GB of VRAM.
>>
>> Setups:
>> Tonga - 2 GB: It's nearly unplayable, because freezes occur too often.
>> Fiji - 4 GB: There is one freeze at the beginning (which is annoying
>> too), after that it's smooth.
>>
>> So even 4 GB is not enough.
>>
>> Workarounds:
>> - Split buffers into smaller pieces in the kernel. It's not necessary
>> to manage memory at page granularity (64KB). Splitting buffers into
>> 16MB-large pieces might not be optimal but it would be a significant
>> improvement.
>> - Or do the same in Mesa. This would prevent inter-process and
>> inter-API buffer sharing for split buffers (DRI, OpenCL), but we would
>> at least verify how much the situation improves.
>>
>> Other issues sharing the same cause:
>> - Allocations requesting 1/3 or more VRAM have a high chance of
>> failing. It's generally not possible to allocate 1/2 or more VRAM as
>> one buffer.
>>
>> Comments welcome,
>>
>> Marek
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx