Am 10.03.25 um 18:29 schrieb Natalie Vock: > On 07.03.25 09:39, Christian König wrote: >> Am 06.03.25 um 18:01 schrieb Natalie Vock: >>> When userspace requests buffers to be placed into GTT | VRAM, it is >>> requesting the buffer to be placed into either of these domains. If the >>> buffer fits into VRAM but does not fit into GTT, then let the buffer >>> reside in VRAM instead of failing allocation entirely. >> >> That will completely break suspend/resume on laptops. >> >> we essentially need to always check if a BO can fit into GTT. > > Good point, I forgot about suspend. But as you say, we always need to do > this, even for VRAM-only BOs. I'll send a v2 for that in a minute. IIRC we used to have that at some point, but removed it for reasons I don't fully remember any more. We also support over-committing GTT on suspend. In other words the GTT limit is not relevant any more and we rather just try to allocate from system memory directly. But at least current we still absolutely need to fit a BOs into the available system memory on suspend, but as soon as that is done the BO can go to swap immediately. Regards, Christian. > > Thanks, Natalie > >> >>> >>> Reported-by: Ivan Avdeev <1@xxxxxxxxx> >>> Signed-off-by: Natalie Vock <natalie.vock@xxxxxx> >>> --- >>> This originally came up in https://gitlab.freedesktop.org/mesa/mesa/-/issues/12713: >>> The crux of the issue is that some applications expect they can allocate >>> buffers up to the size of VRAM - however, some setups have a >>> smaller-than-VRAM GTT region. RADV always sets VRAM | GTT for all buffer >>> allocations, so this causes amdgpu to reject the allocation entirely >>> because it cannot fit into GTT, even though it could fit into VRAM. >>> >>> In my opinion, this check doesn't make sense: It is completely valid >>> behavior for the kernel to always keep a VRAM | GTT buffer in VRAM. >> >> No it isn't. On suspend the power to VRAM is turned off and so we always need to be able to evacuate buffers into GTT. >> >> Regards, >> Christian. >> >>> The only case where buffers larger than the GTT size are special is that >>> they cannot be evicted to GTT (or swapped out), but the kernel already >>> allows VRAM-only buffers to be larger than GTT, and those cannot be >>> swapped out either. With the check removed, VRAM | GTT buffers larger >>> than GTT behave exactly like VRAM-only buffers larger than GTT. >>> >>> The patch adding this check seems to have added it in a v2 - however I >>> was unable to find any public discussion around the patch with reasoning >>> in favor of this check. >>> --- >>> drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 30 ++++++++++------------ >>> 1 file changed, 14 insertions(+), 16 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c >>> index d09db052e282d..b5e5fea091bf1 100644 >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c >>> @@ -555,27 +555,25 @@ static bool amdgpu_bo_validate_size(struct amdgpu_device *adev, >>> { >>> struct ttm_resource_manager *man = NULL; >>> >>> - /* >>> - * If GTT is part of requested domains the check must succeed to >>> - * allow fall back to GTT. >>> - */ >>> - if (domain & AMDGPU_GEM_DOMAIN_GTT) >>> - man = ttm_manager_type(&adev->mman.bdev, TTM_PL_TT); >>> - else if (domain & AMDGPU_GEM_DOMAIN_VRAM) >>> - man = ttm_manager_type(&adev->mman.bdev, TTM_PL_VRAM); >>> - else >>> + /* TODO add more domains checks, such as AMDGPU_GEM_DOMAIN_CPU, _DOMAIN_DOORBELL */ >>> + if (!(domain & (AMDGPU_GEM_DOMAIN_GTT | AMDGPU_GEM_DOMAIN_VRAM))) >>> return true; >>> >>> - if (!man) { >>> - if (domain & AMDGPU_GEM_DOMAIN_GTT) >>> + if (domain & AMDGPU_GEM_DOMAIN_VRAM) { >>> + man = ttm_manager_type(&adev->mman.bdev, TTM_PL_VRAM); >>> + if (size < man->size) >>> + return true; >>> + } >>> + if (domain & AMDGPU_GEM_DOMAIN_GTT) { >>> + man = ttm_manager_type(&adev->mman.bdev, TTM_PL_TT); >>> + if (!man) { >>> WARN_ON_ONCE("GTT domain requested but GTT mem manager uninitialized"); >>> - return false; >>> + return false; >>> + } >>> + if (size < man->size) >>> + return true; >>> } >>> >>> - /* TODO add more domains checks, such as AMDGPU_GEM_DOMAIN_CPU, _DOMAIN_DOORBELL */ >>> - if (size < man->size) >>> - return true; >>> - >>> DRM_DEBUG("BO size %lu > total memory in domain: %llu\n", size, man->size); >>> return false; >>> } >>> -- >>> 2.48.1 >>> >> >