On 2019-06-12 3:28 p.m., Christian König wrote:
> On 12.06.19 at 17:13, Yang, Philip wrote:
>> TTM create two zones, kernel zone and dma32 zone for system memory. If
>> system memory address allocated is below 4GB, this account to dma32 zone
>> and will exhaust dma32 zone and trigger unnecessary TTM eviction.
>>
>> Patch "drm/ttm: Account for kernel allocations in kernel zone only" only
>> handle the allocation for acc_size, the system memory page allocation is
>> through ttm_mem_global_alloc_page which still account to dma32 zone if
>> page is below 4GB.
>
> NAK, as the name says the mem_glob is global for all devices in the system.
>
> So this will break if you mix DMA32 and non DMA32 in the same system
> which is exactly the configuration my laptop here has :(
>

I didn't find a code path that uses the dma32 zone, but I may have missed
something.

There is an issue found by KFDTest.BigBufStressTest. The test allocates
buffers up to 3/8 of the total 256GB of system memory, 128MB per buffer,
and then uses a queue to write to the buffers. If the page that
ttm_mem_global_alloc_page gets has a pfn below 4GB, it is accounted to the
dma32 zone and eventually exhausts that zone's 2GB limit. ttm_check_swapping
then schedules ttm_shrink_work to start eviction. The restore takes minutes
to finish (it retries many times if busy), and the test fails because of a
queue timeout.

This eviction is unnecessary because we still have enough free system
memory. The failure is random and happens about 1 time in 5. I can
increase the timeout value in the test to work around this, but this looks
like a TTM bug. It will slow application performance a lot whenever this
random issue happens.

Thanks,
Philip

> Christian.
>
>>
>> Change-Id: I289b85d891b8f64a1422c42b1eab398098ab7ef7
>> Signed-off-by: Philip Yang <Philip.Yang@xxxxxxx>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> index 2778ff63d97d..79bb9dfe617b 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> @@ -1686,6 +1686,10 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
>>       }
>>       adev->mman.initialized = true;
>>   
>> +    /* Only kernel zone (no dma32 zone) if device does not require dma32 */
>> +    if (!adev->need_dma32)
>> +        adev->mman.bdev.glob->mem_glob->num_zones = 1;
>> +
>>       /* We opt to avoid OOM on system pages allocations */
>>       adev->mman.bdev.no_retry = true;
>
_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx