Re: [PATCH] Revert "drm/amdgpu: remove TOPDOWN flags when allocating VRAM in large bar system"

Arunpravin Paneer Selvam <arunpravin.paneerselvam@xxxxxxx> · Mon, 29 May 2023 12:14:54 +0530

On 5/29/2023 8:05 AM, Xiao, Shane wrote:
[AMD Official Use Only - General]

-----Original Message-----
From: Paneer Selvam, Arunpravin <Arunpravin.PaneerSelvam@xxxxxxx>
Sent: Saturday, May 27, 2023 2:12 AM
To: Xiao, Shane <shane.xiao@xxxxxxx>; Kuehling, Felix
<Felix.Kuehling@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Koenig, Christian
<Christian.Koenig@xxxxxxx>
Subject: Re: [PATCH] Revert "drm/amdgpu: remove TOPDOWN flags when
allocating VRAM in large bar system"



On 5/22/2023 8:52 AM, Xiao, Shane wrote:
[AMD Official Use Only - General]

-----Original Message-----
From: Kuehling, Felix <Felix.Kuehling@xxxxxxx>
Sent: Sunday, May 21, 2023 2:39 AM
To: Paneer Selvam, Arunpravin <Arunpravin.PaneerSelvam@xxxxxxx>;
amd-
gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Koenig, Christian
<Christian.Koenig@xxxxxxx>; Xiao, Shane <shane.xiao@xxxxxxx>
Subject: Re: [PATCH] Revert "drm/amdgpu: remove TOPDOWN flags when
allocating VRAM in large bar system"

Am 2023-05-20 um 05:25 schrieb Arunpravin Paneer Selvam:
This reverts commit c105518679b6e87232874ffc989ec403bee59664.

This patch disables the TOPDOWN flag for APU and few dGPU cards
which has the VRAM size equal to the BAR size.
With resizable BARs it's not that rare.


When we enable the TOPDOWN flag, we get the free blocks at the
highest available memory region and we don't split the lower order blocks.
This change is required to keep off the fragmentation related issues
particularly in ASIC which has VRAM space <= 500MiB
As far as I can understand that, both ways may cause fragmentation issues.

As I can understand that remove TOPDOWN flags may get two pros:
1) It can reduce the research time to O(1);
2) Reduce the risk of splitting higher order into lower orders when allocating
blocks.
But as for the second point, Removing TOPDOWN flags still carries the
risk of causing fragmentation issue on some scenarios.
For example, the apps need allocate (1M + 4K) VRAM size, which is not a
power of two.
Case 1: If not using TOPDOWN flag, we will allocate 1M order size and 4K
order size from different order.
--And then if this 4K buddy and 1M buddy are freed, it may prevent the system
from merging 4K(or 1M) to 8K(2M) order.
--If the app has many such allocation requirements, it may cause
fragmentation issue under memory stress.
Case 2: If using TOPDOWN flag, we will find the highest order which is
suitable for 4K and 1M needs.
--Assuming the highest order size is 8M, then we will split this 8M to lower
order to meet the needs of 1M and 4K sizes, respectively.
--In such case the 8M size will split into different lower orders.
--If any other thread or process need allocate 4K or 1M VRAM, and there is
also a chance to prevent the system from merging 4K(or 1M) to 8K(or 2M)
order.
If there are so many needs to allocate not power of two pages from apps,
one of the choose is to use suballocator from UMD such as what ROCr does.
That means suballocator can allocate power of two pages(for example 2M
bytes) from driver, and the app can allocate memory from suballocator.
In such way, it may reduce the risk of causing fragmentation issues.  But I am
not sure that such an option is feasible.
Maybe case 2 occurs less frequently than case one, then we should use
TOPDOWN flags whether the system is lar-bar or not.
Yes, It is better to set TOPDOWN flag mainly to avoid splitting down the lower
order blocks which resolves the fragmentation issues and it doesn't try to
allocate from visible memory (provided that we have the enough requested
memory space in the top down region) which improves the graphics
performances as well.
Hi Arun,
Think it more, the gfx needs get continuous VRAM for display something.
In such case, the TOPDOWN flag can meet this need better than not use TOPDOWN flag because the drm buddy TOPDOWN flag
is prone to meet the condition in amdgpu_is_vram_mgr_blocks_contiguous, which means that the blocks are contiguous.

As you mentioned that issue https://gitlab.freedesktop.org/drm/amd/-/issues/2270 , one of the reasons may be that TOPDOWN
flag in drm buddy can make it easier to get contiguous VRAM with the size of not power of two.
This means that we do not need allocate contiguous VRAM again and then do copy for display something.
In such way, we can get good performance and reduce the risk of allocation failures.
If so, it's necessary to use TOPDOWN flag for gfx.
The reason behind setting TOPDOWN flag is that we don't hold the small 
blocks randomly preventing the merge back into larger sized blocks.
Enabling TOPDOWN flag allocates the smaller sized blocks from the higher 
orders which are split already for the previous requests, thus we
don't disturb the lower order or middle order blocks. Hence we don't hit 
fragmentation issues.


Hi Felix & Christian,
However, for compute, there is suballocator on ROCr to allocate power of two sizes from drm buddy.
If we use the TOPDOWN flag,  there is a potential problem that split higher order into lower order.
Exactly, this is what we want to solve the fragmentation problems.
It may increase the risk of allocating higher order blocks under memory pressure if we split too much higher order into lower order.
This is not a risk, as I explained previously we dont disturb the lower 
or middle order blocks, hence we could see improvement in performance.

If so, my question is that do we need to consider using different allocator strategies on gfx and compute in large bar system?

Best Regards,
Shane

Regards,
Arun.
Best Regards,
Shane

If TOPDOWN has these general benefits, then would it make sense to
allocate visible memory TOPDOWN as well, on large-BAR GPUs? Without
knowing too much about the internals of the allocator, my intuition
tells me that using only one allocation strategy has a better chance
of minimizing fragmentation than mixing two allocation strategies for no
good reason.
Regards,
     Felix


Hence, we are reverting this patch.

Gitlab issue link -
https://gitlab.freedesktop.org/drm/amd/-/issues/2270

Signed-off-by: Arunpravin Paneer Selvam
<Arunpravin.PaneerSelvam@xxxxxxx>
---
    drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 +-
    1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 2bd1a54ee866..ca5fc07faf6c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -139,7 +139,7 @@ void amdgpu_bo_placement_from_domain(struct
amdgpu_bo *abo, u32 domain)

              if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)
                      places[c].lpfn = visible_pfn;
-           else if (adev->gmc.real_vram_size != adev-
gmc.visible_vram_size)
+           else
                      places[c].flags |= TTM_PL_FLAG_TOPDOWN;

              if (flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS)