On 04.12.19 at 14:18, Thomas Hellström (VMware) wrote:
On 12/4/19 1:16 PM, Christian König wrote:
On 04.12.19 at 12:45, Thomas Hellström (VMware) wrote:
On 12/4/19 12:13 PM, Christian König wrote:
On 03.12.19 at 14:22, Thomas Hellström (VMware) wrote:
From: Thomas Hellstrom <thellstrom@xxxxxxxxxx>
Using huge page-table entries requires that the start of a buffer object
is huge page size aligned. So introduce a ttm_bo_man_get_node_huge()
function that attempts to accomplish this for allocations that are larger
than the huge page size, and provide a new range-manager instance that
uses that function.
I still don't think that this is a good idea.
Again, can you elaborate with some specific concerns?
You seem to be seeing PUD as something optional.
The driver/userspace should just use a proper alignment if it wants
to use huge pages.
There are drawbacks with this approach. The TTM alignment is a hard
constraint. Assume that you want to fit a 1GB buffer object into
limited VRAM space, and _if possible_ use PUD size huge pages. Let's
say there is 1GB available, but not 1GB aligned. The proper
alignment approach would fail and possibly start to evict stuff from
VRAM just to be able to accommodate the PUD alignment. That's bad.
The approach I suggest would instead fall back to PMD alignment and
use 2MB page table entries if possible, and as a last resort use 4K
page table entries.
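
The fallback order described above can be sketched roughly as follows. This is only an illustration in plain userspace C, not the code from the patch: soft_align_place(), the SKETCH_* constants and the single free "hole" given in 4K pages are hypothetical names and simplifications, and the rule that an alignment is only tried when the allocation is at least that large is an assumption based on the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sizes: 4K base pages, 2MB PMD and 1GB PUD huge pages. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PMD_PAGES (1ULL << (21 - SKETCH_PAGE_SHIFT)) /* 512 pages */
#define SKETCH_PUD_PAGES (1ULL << (30 - SKETCH_PAGE_SHIFT)) /* 262144 pages */

/* Round x up to a multiple of align; align must be a power of two. */
static uint64_t sketch_align_up(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: place num_pages inside the free range
 * [hole_start, hole_end), all in units of 4K pages. Try PUD alignment
 * first, then PMD alignment, then no alignment. Nothing is evicted;
 * if an aligned placement does not fit, the next smaller alignment
 * is tried instead.
 * Returns 0 and fills *out_start / *out_align on success, -1 otherwise.
 */
static int soft_align_place(uint64_t hole_start, uint64_t hole_end,
			    uint64_t num_pages,
			    uint64_t *out_start, uint64_t *out_align)
{
	static const uint64_t aligns[] = {
		SKETCH_PUD_PAGES, SKETCH_PMD_PAGES, 1
	};
	size_t i;

	assert(out_start && out_align);
	if (num_pages == 0 || hole_end < hole_start)
		return -1;

	for (i = 0; i < sizeof(aligns) / sizeof(aligns[0]); i++) {
		uint64_t align = aligns[i];
		uint64_t start;

		/* A huge alignment only helps if the object spans one. */
		if (num_pages < align)
			continue;

		start = sketch_align_up(hole_start, align);
		if (start <= hole_end && hole_end - start >= num_pages) {
			*out_start = start;
			*out_align = align;
			return 0;
		}
	}
	return -1;
}
```

With a hole that is slightly larger than 1GB but does not start on a 1GB boundary, a 1GB request ends up PMD aligned here rather than failing, which is the "soft" behaviour argued for above. A hard alignment would correspond to trying only the first entry of the table.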
And exactly that sounds like a bad idea to me.
Using 1GB alignment is indeed unrealistic in most cases, but for 2MB
alignment we should really start to evict BOs.
Otherwise the address space can become fragmented and we won't be
able to de-fragment it in any way.
Ah, I see. Yeah, that's the THP tradeoff between fragmentation and
memory-usage. From my point of view, it's not self-evident that either
approach is the best one, but the nice thing with the suggested code
is that you can view it as an optional helper. For example, to avoid
fragmentation and have a high huge-page hit ratio for 2MB pages, you'd
either inflate the buffer object size to be 2MB aligned, which would
also affect system memory, or you'd set the TTM memory alignment to
2MB. If in addition you'd like "soft" (non-evicting) alignment also
for 1GB pages, you'd also hook up the new range manager. I figure
different drivers would want to use different strategies.
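
To make the cost of the first option concrete, here is a small sketch of what inflating a buffer object size to a 2MB multiple means in bytes. The helper names and the SKETCH_PMD_SIZE constant are made up for illustration and are not part of TTM; a 2MB PMD size is assumed.

```c
#include <stdint.h>

/* Assumed PMD huge page size of 2MB. */
#define SKETCH_PMD_SIZE (2ULL << 20)

/* Hypothetical helper: round a buffer object size up to a 2MB multiple. */
static uint64_t inflate_to_pmd(uint64_t size_bytes)
{
	return (size_bytes + SKETCH_PMD_SIZE - 1) & ~(SKETCH_PMD_SIZE - 1);
}

/*
 * Bytes added by the inflation. Since the object itself grows, this
 * padding is paid wherever the object is placed, including system memory.
 */
static uint64_t inflation_overhead(uint64_t size_bytes)
{
	return inflate_to_pmd(size_bytes) - size_bytes;
}
```

A 3MB object, for instance, becomes 4MB and carries 1MB of padding, whereas setting an alignment only constrains where the object may start.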
In any case, vmwgfx would, due to its very limited VRAM size, want to
use the "soft" alignment provided by this patch, but if you don't see
any other drivers wanting that, I could definitely move it to vmwgfx.
Ok, let's do it this way then. Both amdgpu and nouveau have
specialized allocators anyway and I don't see the need for this in radeon.
Regards,
Christian.
/Thomas