On 06.01.21 at 18:04, Joshua Ashton wrote:
On 1/6/21 2:59 PM, Christian König wrote:
On 06.01.21 at 15:18, Joshua Ashton wrote:
[SNIP]
For Vulkan we (both RADV and AMDVLK) use GTT as the total size.
Usage in modern games is essentially "bindless" so there is no
way to track at a per-submission level what memory needs to be
resident. (and even with tracking applications are allowed to
use all the memory in a single draw call, which would be
unsplittable anyway ...)
Yeah, that is a really good point.
The issue is that we need some limitation since 3/4 of system
memory is way too much and the max texture size test in piglit can
cause a system crash.
The alternative is better OOM handling, so that an application
which uses too much system memory through the driver stack is
more likely to get killed. Because currently that is either
X or Wayland :(
Christian.
As I understand it, what is being exposed right now is essentially
max(vram size, 3GiB) limited by 3/4ths of the memory. Previously,
before the revert what was being taken was just max(3GiB, 3/4ths).
If you had < 3GiB of system memory that seems like a bit of an
issue that could easily lead to OOM to me?
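To make the two formulas above concrete, here is a minimal sketch of the arithmetic as described in this mail. The function names are made up for illustration, and this is not the actual amdgpu code, only the expressions as I understand them.

```python
GIB = 1 << 30  # bytes in one GiB

def gtt_size_current(vram_size, system_mem):
    """max(vram size, 3GiB), limited to 3/4 of system memory (all in bytes)."""
    return min(max(vram_size, 3 * GIB), system_mem * 3 // 4)

def gtt_size_before_revert(system_mem):
    """max(3GiB, 3/4 of system memory), with no upper cap (all in bytes)."""
    return max(3 * GIB, system_mem * 3 // 4)
```

With 1GiB of system memory, `gtt_size_before_revert(1 * GIB)` evaluates to 3GiB, which is larger than the memory actually present, while `gtt_size_current` stays at 3/4 of system memory for the same machine.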
Not really, as I said GTT is only the memory the GPU can lock at
the same time. It is perfectly possible to have that larger than
the available system memory.
In other words this is *not* to prevent using too much system
memory; for this we have an additional limit inside TTM. Instead
it is to have a reasonable limit so that applications do not use
too much memory at the same time.
Worth noting that this GTT size here also affects the memory
reporting and budgeting for applications. If the user has 1GiB of
total system memory and 3GiB set here, then 3GiB will be the budget
and size exposed to applications too...
Yeah, that's indeed problematic.
(On APUs,) we really don't want to expose more GTT than system
memory. Apps will eat into it and end up swapping or running into
OOM *very* quickly. (I imagine this is likely what was being run
into before the revert.)
No, the issue is that some applications try to allocate textures way
above some reasonable limit.
Alternatively, in RADV and other user space drivers like AMDVLK, we
could limit this to the system memory size or 3/4ths ourselves.
Although that's kinda gross and I don't think that's the correct
path...
Ok, let me explain from the other side: We have this limitation
because otherwise some tests like the maximum texture size test for
OpenGL crash the system. And this is independent of your system
configuration.
We could of course add another limit for the texture size in
OpenGL/RADV/AMDVLK, but I agree that this is rather awkward.
Are you hitting on something smaller than 3/4ths right now? I
remember the source commit mentioned they only had 1GiB of system
memory available, so that could be possible if you had a carveout
of < 768MiB...
What do you mean with that? I don't have a test system at hand for
this if that's what you are asking for.
This was mainly a question to whoever did the revert. The question
was meant to find out some extra info about what they were using at
the time.
You don't need a specific system configuration for this, just try to
run the max texture size test in piglit.
Regards,
Christian.
I see... I have not managed to reproduce a hang as described in the
revert commit, but I have had a soft crash and delay with the OOM
killer ending X.org after a little bit when GTT > system memory.
I tested with max-texture-size on both Renoir and Picasso under the
following conditions:
16GiB RAM + 12 GiB GTT -> test works fine
16GiB RAM + 64 GiB GTT -> OOM killer kills X.org after a little bit of
waiting (piglit died with it)
2 GiB RAM + 1.5GiB GTT -> test works fine
I also tested on my Radeon VII and it worked fine regardless of the
GTT size there, although that card has more than enough video memory
anyway for this not to be an issue there 🐸.
Limiting my system memory to 2GiB, the card's memory and visible
memory to 1GiB and the GTT to 1.75GiB, the test works fine.
The only time I ever had problems with a crash or pseudo-hang (waiting
for the OOM killer while the system was locked up) was whenever GTT
was > system memory (i.e. in the reverted commit)
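As a quick sanity check of that observation, the results reported above can be written down and compared against the "GTT > system memory" condition. This is only a sketch: the tuple layout and the helper name are illustrative, and the data is just the four configurations from this mail.

```python
# (system RAM in GiB, GTT in GiB, test passed?) as reported in this mail
results = [
    (16, 12, True),     # Renoir / Picasso: works fine
    (16, 64, False),    # Renoir / Picasso: OOM killer ends X.org
    (2, 1.5, True),     # Renoir / Picasso: works fine
    (2, 1.75, True),    # Radeon VII with memory limited: works fine
]

def gtt_exceeds_ram(ram_gib, gtt_gib):
    """True when the GTT size is larger than the system memory."""
    return gtt_gib > ram_gib

# The test failed exactly in the configurations where GTT > system memory.
consistent = all(
    passed != gtt_exceeds_ram(ram, gtt) for ram, gtt, passed in results
)
```

Here `consistent` is true for the configurations listed, which matches the observation but is of course only four data points.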
If I edited my commit to universally use 3/4ths of the system memory
for GTT for all hardware, would that be considered for merging?
Well maybe 1/2 and only on APUs. And you need to find somebody with
another Raven to test that. Maybe Nirmoy has time for this.
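One possible reading of that suggestion, as a rough sketch: APUs get 1/2 of system memory as GTT and everything else keeps the current rule. The function name, the `is_apu` flag and the choice to leave discrete GPUs unchanged are assumptions for illustration, not something settled in this thread.

```python
GIB = 1 << 30  # bytes in one GiB

def proposed_gtt_size(vram_size, system_mem, is_apu):
    """Hypothetical GTT default in bytes; one interpretation of '1/2 and only on APUs'."""
    if is_apu:
        # Assumption: on APUs use half of the system memory.
        return system_mem // 2
    # Assumption: other hardware keeps max(vram size, 3GiB) capped at 3/4 of system memory.
    return min(max(vram_size, 3 * GIB), system_mem * 3 // 4)
```

For an APU with 2GiB of system memory this gives 1GiB of GTT, so the exposed size stays below the memory that is actually present.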
Regards,
Christian.
Thanks!
- Joshie 🐸✨
_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx