Re: [PATCH] drm/amdgpu: only use kernel zone if need_dma32 is not required

On 2019-06-12 3:28 p.m., Christian König wrote:
> On 12.06.19 at 17:13, Yang, Philip wrote:
>> TTM creates two zones for system memory, the kernel zone and the dma32 zone.
>> If an allocated system memory page is below 4GB, it is also accounted to the
>> dma32 zone, which can exhaust that zone and trigger unnecessary TTM eviction.
>>
>> Patch "drm/ttm: Account for kernel allocations in kernel zone only" only
>> handles the allocation for acc_size. The system memory page allocation goes
>> through ttm_mem_global_alloc_page, which still accounts to the dma32 zone if
>> the page is below 4GB.
> 
> NAK, as the name says the mem_glob is global for all devices in the system.
> 
> So this will break if you mix DMA32 and non-DMA32 devices in the same
> system, which is exactly the configuration my laptop here has :(
>
I didn't find a path that uses the dma32 zone, but I may have missed
something. There is an issue found by KFDTest.BigBufStressTest: it
allocates buffers up to 3/8 of the total 256GB system memory, each 128MB
in size, then uses a queue to write to the buffers. If
ttm_mem_global_alloc_page gets a page whose pfn is below 4GB, the page is
accounted to the dma32 zone, which eventually exhausts the 2GB limit;
ttm_check_swapping then schedules ttm_shrink_work to start eviction. The
restore takes minutes to finish (it retries many times if busy), and the
test fails because of a queue timeout. This eviction is unnecessary
because we still have enough free system memory.
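
To illustrate the accounting described above, here is a minimal user-space
sketch. It is not the TTM code: the struct names, the simulate() helper, the
128GB kernel zone limit, and the assumption that the first low_bytes of the
allocation land below 4GB are all hypothetical simplifications. The only parts
taken from this thread are that pages below 4GB are also accounted to the dma32
zone, that the dma32 limit is 2GB, and that the test allocates up to 3/8 of
256GB (96GB, i.e. 768 buffers of 128MB).

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096ULL
#define GiB (1024ULL * 1024ULL * 1024ULL)
#define PFN_4G (4ULL * GiB / PAGE_SIZE_BYTES) /* first pfn at or above 4GB */

/* Hypothetical, simplified stand-in for one TTM accounting zone. */
struct zone_model {
    uint64_t used;       /* bytes currently accounted to this zone */
    uint64_t swap_limit; /* exceeding this would schedule shrinking */
};

/* Hypothetical stand-in for the global accounting state. */
struct glob_model {
    struct zone_model kernel;
    struct zone_model dma32;
    unsigned num_zones; /* 2 = kernel + dma32, 1 = kernel only */
};

/* Every page counts against the kernel zone; pages below 4GB also
 * count against the dma32 zone when that zone exists. */
static void account_page(struct glob_model *g, uint64_t pfn)
{
    g->kernel.used += PAGE_SIZE_BYTES;
    if (g->num_zones > 1 && pfn < PFN_4G)
        g->dma32.used += PAGE_SIZE_BYTES;
}

/* True if any active zone is over its limit. */
static bool needs_shrink(const struct glob_model *g)
{
    if (g->kernel.used > g->kernel.swap_limit)
        return true;
    return g->num_zones > 1 && g->dma32.used > g->dma32.swap_limit;
}

/* Allocate total_bytes page by page; the first low_bytes (at most 4GB)
 * are assumed to come from pfns below 4GB. Returns true as soon as the
 * model would start shrinking. Limits are assumptions: 128GB for the
 * kernel zone, 2GB for dma32 as quoted in this thread. */
static bool simulate(unsigned num_zones, uint64_t total_bytes,
                     uint64_t low_bytes)
{
    struct glob_model g = {
        .kernel = { .used = 0, .swap_limit = 128 * GiB },
        .dma32 = { .used = 0, .swap_limit = 2 * GiB },
        .num_zones = num_zones,
    };
    uint64_t low_pages = low_bytes / PAGE_SIZE_BYTES;
    uint64_t total_pages = total_bytes / PAGE_SIZE_BYTES;

    for (uint64_t i = 0; i < total_pages; i++) {
        uint64_t pfn = (i < low_pages) ? i : PFN_4G + i;

        account_page(&g, pfn);
        if (needs_shrink(&g))
            return true;
    }
    return false;
}
```

In this model, simulate(2, 96 * GiB, 3 * GiB) returns true because the dma32
zone goes over 2GB even though the kernel zone is far from its limit, while
simulate(1, 96 * GiB, 3 * GiB) returns false. How many pages actually land
below 4GB depends on the page allocator, which would be consistent with the
failure being intermittent.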

This is a random failure that happens in about 1 of 5 runs. I can increase
the timeout value in the test to work around it, but this looks like a TTM
bug. When it does happen, it will slow application performance a lot.

Thanks,
Philip


> Christian.
> 
>>
>> Change-Id: I289b85d891b8f64a1422c42b1eab398098ab7ef7
>> Signed-off-by: Philip Yang <Philip.Yang@xxxxxxx>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> index 2778ff63d97d..79bb9dfe617b 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> @@ -1686,6 +1686,10 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
>>       }
>>       adev->mman.initialized = true;
>> +    /* Only kernel zone (no dma32 zone) if device does not require dma32 */
>> +    if (!adev->need_dma32)
>> +        adev->mman.bdev.glob->mem_glob->num_zones = 1;
>> +
>>       /* We opt to avoid OOM on system pages allocations */
>>       adev->mman.bdev.no_retry = true;
> 
_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx



