RE: [PATCH] Revert "drm/amdgpu: remove TOPDOWN flags when allocating VRAM in large bar system"

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



[AMD Official Use Only - General]

> -----Original Message-----
> From: Kuehling, Felix <Felix.Kuehling@xxxxxxx>
> Sent: Sunday, May 21, 2023 2:39 AM
> To: Paneer Selvam, Arunpravin <Arunpravin.PaneerSelvam@xxxxxxx>; amd-
> gfx@xxxxxxxxxxxxxxxxxxxxx
> Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Koenig, Christian
> <Christian.Koenig@xxxxxxx>; Xiao, Shane <shane.xiao@xxxxxxx>
> Subject: Re: [PATCH] Revert "drm/amdgpu: remove TOPDOWN flags when
> allocating VRAM in large bar system"
>
> Am 2023-05-20 um 05:25 schrieb Arunpravin Paneer Selvam:
> > This reverts commit c105518679b6e87232874ffc989ec403bee59664.
> >
> > This patch disables the TOPDOWN flag for APU and few dGPU cards which
> > has the VRAM size equal to the BAR size.
>
> With resizable BARs it's not that rare.
>
>
> >
> > When we enable the TOPDOWN flag, we get the free blocks at the highest
> > available memory region and we don't split the lower order blocks.
> > This change is required to keep off the fragmentation related issues
> > particularly in ASIC which has VRAM space <= 500MiB

As far as I can understand that, both ways may cause fragmentation issues.

As I can understand that remove TOPDOWN flags may get two pros:
1) It can reduce the research time to O(1);
2) Reduce the risk of splitting higher order into lower orders when allocating blocks.

But as for the second point, Removing TOPDOWN flags still carries the risk of causing
fragmentation issue on some scenarios.
For example, the apps need allocate (1M + 4K) VRAM size, which is not a power of two.

Case 1: If not using TOPDOWN flag, we will allocate 1M order size and 4K order size from different order.
--And then if this 4K buddy and 1M buddy are freed, it may prevent the system from merging 4K(or 1M) to 8K(2M) order.
--If the app has many such allocation requirements, it may cause fragmentation issue under memory stress.

Case 2: If using TOPDOWN flag, we will find the highest order which is suitable for 4K and 1M needs.
--Assuming the highest order size is 8M, then we will split this 8M to lower order to meet the needs of 1M and 4K sizes, respectively.
--In such case the 8M size will split into different lower orders.
--If any other thread or process need allocate 4K or 1M VRAM, and there is also a chance to prevent the system from merging 4K(or 1M) to 8K(or 2M) order.

If there are so many needs to allocate not power of two pages from apps,   one of the choose is to use suballocator from UMD such as what ROCr does.
That means suballocator can allocate power of two pages(for example 2M bytes) from driver, and the app can allocate memory from suballocator.
In such way, it may reduce the risk of causing fragmentation issues.  But I am not sure that such an option is feasible.

Maybe case 2 occurs less frequently than case one, then we should use TOPDOWN flags whether the system is lar-bar or not.

Best Regards,
Shane

>
> If TOPDOWN has these general benefits, then would it make sense to allocate
> visible memory TOPDOWN as well, on large-BAR GPUs? Without knowing too
> much about the internals of the allocator, my intuition tells me that using only
> one allocation strategy has a better chance of minimizing fragmentation than
> mixing two allocation strategies for no good reason.
>
> Regards,
>    Felix
>
>
> >
> > Hence, we are reverting this patch.
> >
> > Gitlab issue link -
> > https://gitlab.freedesktop.org/drm/amd/-/issues/2270
> >
> > Signed-off-by: Arunpravin Paneer Selvam
> > <Arunpravin.PaneerSelvam@xxxxxxx>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > index 2bd1a54ee866..ca5fc07faf6c 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > @@ -139,7 +139,7 @@ void amdgpu_bo_placement_from_domain(struct
> > amdgpu_bo *abo, u32 domain)
> >
> >             if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)
> >                     places[c].lpfn = visible_pfn;
> > -           else if (adev->gmc.real_vram_size != adev-
> >gmc.visible_vram_size)
> > +           else
> >                     places[c].flags |= TTM_PL_FLAG_TOPDOWN;
> >
> >             if (flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS)




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux