I could try it on my carrizo/polaris setup. Is there a test procedure I
could follow to trigger the changed code paths?

Tom

On 2019-10-31 6:41 a.m., Koenig, Christian wrote:
> Just tested this and amdgpu_vm_update_ptes() indeed works as expected.
>
> When you free at least 2MB, the lowest level of page tables is freed
> up again.
>
> BTW: What hardware have you tested this on? On gfx8 and older it is
> expected that page tables are never freed.
>
> Regards,
> Christian.
>
> On 30.10.19 at 19:11, Christian König wrote:
>> Then I don't see how this patch actually changes anything.
>>
>> It could only be a bug in amdgpu_vm_update_ptes(). Going to
>> investigate this, but I won't have time to look into the ticket in
>> detail.
>>
>> Regards,
>> Christian.
>>
>> On 30.10.19 at 19:00, Huang, JinHuiEric wrote:
>>>
>>> Actually I do prevent removing in-use PTs with this:
>>>
>>> +        r = amdgpu_vm_remove_ptes(adev, vm,
>>> +                (mapping->start + 0x1ff) & (~0x1ffll),
>>> +                (mapping->last + 1) & (~0x1ffll));
>>>
>>> which only removes page tables aligned to 2MB. And I have tested it,
>>> at least with the KFD tests, without anything breaking.
>>>
>>> By the way, I am not familiar with the memory stuff. This patch is
>>> the best I can do for now. Could you take a look at the Jira ticket
>>> SWDEV-201443 and find a better solution? Thanks!
>>>
>>> Regards,
>>>
>>> Eric
>>>
>>> On 2019-10-30 1:57 p.m., Christian König wrote:
>>>> One thing I've forgotten:
>>>>
>>>> What you could maybe do to improve the situation is to join
>>>> adjacent ranges in amdgpu_vm_clear_freed(), but I'm not sure how
>>>> good the chances are that the ranges are freed all together.
>>>>
>>>> The only other alternative I can see would be to check the mappings
>>>> of a range in amdgpu_update_ptes() and see if you could walk the
>>>> tree up if the valid flag is not set and there are no mappings left
>>>> for a page table.
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>> On 30.10.19 at 18:42, Koenig, Christian wrote:
>>>>>> The valid flag doesn't take effect in this function.
>>>>> That's irrelevant.
>>>>>
>>>>> See, what amdgpu_vm_update_ptes() does is to first determine the
>>>>> fragment size:
>>>>>> amdgpu_vm_fragment(params, frag_start, end, flags, &frag, &frag_end);
>>>>>
>>>>> Then we walk down the tree:
>>>>>> amdgpu_vm_pt_start(adev, params->vm, start, &cursor);
>>>>>> while (cursor.pfn < end) {
>>>>>
>>>>> And make sure that the page tables covering the address range are
>>>>> actually allocated:
>>>>>> r = amdgpu_vm_alloc_pts(params->adev, params->vm, &cursor);
>>>>>
>>>>> Then we update the tables with the flags and addresses and free up
>>>>> subsequent tables in the case of huge pages or freed-up areas:
>>>>>> /* Free all child entries */
>>>>>> while (cursor.pfn < frag_start) {
>>>>>>         amdgpu_vm_free_pts(adev, params->vm, &cursor);
>>>>>>         amdgpu_vm_pt_next(adev, &cursor);
>>>>>> }
>>>>>
>>>>> This is the maximum you can free, because all other page tables
>>>>> are not completely covered by the range and so are potentially
>>>>> still in use.
>>>>>
>>>>> And I have the strong suspicion that this is what your patch is
>>>>> actually doing wrong. In other words, you are also freeing page
>>>>> tables which are only partially covered by the range and so
>>>>> potentially still in use.
>>>>>
>>>>> Since we don't have any tracking of how many entries in a page
>>>>> table are currently valid and how many are invalid, we actually
>>>>> can't implement what you are trying to do here. So the patch is
>>>>> definitely somehow broken.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>> On 30.10.19 at 17:55, Huang, JinHuiEric wrote:
>>>>>>
>>>>>> The valid flag doesn't take effect in this function.
>>>>>> amdgpu_vm_alloc_pts() is always executed, depending only on
>>>>>> "cursor.pfn < end".
The valid flag is only checked here, for ASICs below GMC v9:
>>>>>>
>>>>>>     if (adev->asic_type < CHIP_VEGA10 &&
>>>>>>         (flags & AMDGPU_PTE_VALID))...
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Eric
>>>>>>
>>>>>> On 2019-10-30 12:30 p.m., Koenig, Christian wrote:
>>>>>>>
>>>>>>> On 30.10.2019 17:19, "Huang, JinHuiEric"
>>>>>>> <JinHuiEric.Huang@xxxxxxx> wrote:
>>>>>>>
>>>>>>>     I tested it and it saves a lot of VRAM in the KFD big buffer
>>>>>>>     stress test. I think there are two reasons:
>>>>>>>
>>>>>>>     1. Calling amdgpu_vm_update_ptes() during unmapping will
>>>>>>>     allocate unnecessary PTs, because there is no flag in
>>>>>>>     amdgpu_vm_update_ptes() to tell whether the VA is being
>>>>>>>     mapped or unmapped. This saves most of the memory.
>>>>>>>
>>>>>>> That's not correct. The valid flag is used for this.
>>>>>>>
>>>>>>>     2. Intentionally removing those unmapped PTs is the
>>>>>>>     logically expected behavior, although it does not remove
>>>>>>>     that many PTs.
>>>>>>>
>>>>>>> Well, I actually don't see a change to what update_ptes is doing
>>>>>>> and have the strong suspicion that the patch is simply broken.
>>>>>>>
>>>>>>> You either free page tables which are potentially still in use,
>>>>>>> or update_ptes doesn't free page tables when the valid bit is
>>>>>>> not set.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Christian.
>>>>>>>
>>>>>>>     Regards,
>>>>>>>
>>>>>>>     Eric
>>>>>>>
>>>>>>>     On 2019-10-30 11:57 a.m., Koenig, Christian wrote:
>>>>>>>
>>>>>>>     On 30.10.2019 16:47, "Kuehling, Felix"
>>>>>>>     <Felix.Kuehling@xxxxxxx> wrote:
>>>>>>>
>>>>>>>     On 2019-10-30 9:52 a.m., Christian König wrote:
>>>>>>>     > On 29.10.19 at 21:06, Huang, JinHuiEric wrote:
>>>>>>>     >> The issue is that PT BOs are not freed when unmapping a
>>>>>>>     >> VA, which causes huge accumulated VRAM usage in some
>>>>>>>     >> memory stress tests, such as the KFD big buffer stress
>>>>>>>     >> test.
>>>>>>>     >> Function amdgpu_vm_bo_update_mapping() is called by both
>>>>>>>     >> amdgpu_vm_bo_update() and amdgpu_vm_clear_freed(). The
>>>>>>>     >> solution is to replace amdgpu_vm_bo_update_mapping() in
>>>>>>>     >> amdgpu_vm_clear_freed() with a function that removes PT
>>>>>>>     >> BOs, to save VRAM.
>>>>>>>     >
>>>>>>>     > NAK, that is intentional behavior.
>>>>>>>     >
>>>>>>>     > Otherwise we can run into out-of-memory situations when
>>>>>>>     > page tables need to be allocated again under stress.
>>>>>>>
>>>>>>>     That's a bit arbitrary and inconsistent. We are freeing page
>>>>>>>     tables in other situations, when a mapping uses huge pages
>>>>>>>     in amdgpu_vm_update_ptes. Why not when a mapping is
>>>>>>>     destroyed completely?
>>>>>>>
>>>>>>>     I'm actually a bit surprised that the huge-page handling in
>>>>>>>     amdgpu_vm_update_ptes isn't kicking in to free up
>>>>>>>     lower-level page tables when a BO is unmapped.
>>>>>>>
>>>>>>> Well, it does free the lower level, and that is already causing
>>>>>>> problems (that's why I added the reserved space).
>>>>>>>
>>>>>>> What we don't do is free the higher levels.
>>>>>>>
>>>>>>> E.g. when you free a 2MB BO we free the lowest level; if we free
>>>>>>> a 1GB BO we free the two lowest levels, etc.
>>>>>>>
>>>>>>> The problem with freeing the higher levels is that you don't
>>>>>>> know who else is using them. E.g. we would need to check all
>>>>>>> entries when we unmap one.
>>>>>>>
>>>>>>> It's simply not worth it for a maximum saving of 2MB per VM.
>>>>>>>
>>>>>>> Writing this, I'm actually wondering how you ended up with this
>>>>>>> issue? There shouldn't be much savings from this.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Christian.
>>>>>>>
>>>>>>>     Regards,
>>>>>>>     Felix
>>>>>>>
>>>>>>>     >
>>>>>>>     > Regards,
>>>>>>>     > Christian.
>>>>>>>     >
>>>>>>>     >>
>>>>>>>     >> Change-Id: Ic24e35bff8ca85265b418a642373f189d972a924
>>>>>>>     >> Signed-off-by: Eric Huang <JinhuiEric.Huang@xxxxxxx>
>>>>>>>     >> ---
>>>>>>>     >>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 56 +++++++++++++++++++++++++++++-----
>>>>>>>     >>  1 file changed, 48 insertions(+), 8 deletions(-)
>>>>>>>     >>
>>>>>>>     >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>     >> index 0f4c3b2..8a480c7 100644
>>>>>>>     >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>     >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>     >> @@ -1930,6 +1930,51 @@ static void amdgpu_vm_prt_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm)
>>>>>>>     >>  }
>>>>>>>     >>
>>>>>>>     >>  /**
>>>>>>>     >> + * amdgpu_vm_remove_ptes - free PT BOs
>>>>>>>     >> + *
>>>>>>>     >> + * @adev: amdgpu device structure
>>>>>>>     >> + * @vm: amdgpu vm structure
>>>>>>>     >> + * @start: start of mapped range
>>>>>>>     >> + * @end: end of mapped entry
>>>>>>>     >> + *
>>>>>>>     >> + * Free the page table level.
>>>>>>>     >> + */
>>>>>>>     >> +static int amdgpu_vm_remove_ptes(struct amdgpu_device *adev,
>>>>>>>     >> +               struct amdgpu_vm *vm, uint64_t start, uint64_t end)
>>>>>>>     >> +{
>>>>>>>     >> +       struct amdgpu_vm_pt_cursor cursor;
>>>>>>>     >> +       unsigned shift, num_entries;
>>>>>>>     >> +
>>>>>>>     >> +       amdgpu_vm_pt_start(adev, vm, start, &cursor);
>>>>>>>     >> +       while (cursor.level < AMDGPU_VM_PTB) {
>>>>>>>     >> +               if (!amdgpu_vm_pt_descendant(adev, &cursor))
>>>>>>>     >> +                       return -ENOENT;
>>>>>>>     >> +       }
>>>>>>>     >> +
>>>>>>>     >> +       while (cursor.pfn < end) {
>>>>>>>     >> +               amdgpu_vm_free_table(cursor.entry);
>>>>>>>     >> +               num_entries = amdgpu_vm_num_entries(adev, cursor.level - 1);
>>>>>>>     >> +
>>>>>>>     >> +               if (cursor.entry != &cursor.parent->entries[num_entries - 1]) {
>>>>>>>     >> +                       /* Next ptb entry */
>>>>>>>     >> +                       shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
>>>>>>>     >> +                       cursor.pfn += 1ULL << shift;
>>>>>>>     >> +                       cursor.pfn &= ~((1ULL << shift) - 1);
>>>>>>>     >> +                       cursor.entry++;
>>>>>>>     >> +               } else {
>>>>>>>     >> +                       /* Next ptb entry in next pd0 entry */
>>>>>>>     >> +                       amdgpu_vm_pt_ancestor(&cursor);
>>>>>>>     >> +                       shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
>>>>>>>     >> +                       cursor.pfn += 1ULL << shift;
>>>>>>>     >> +                       cursor.pfn &= ~((1ULL << shift) - 1);
>>>>>>>     >> +                       amdgpu_vm_pt_descendant(adev, &cursor);
>>>>>>>     >> +               }
>>>>>>>     >> +       }
>>>>>>>     >> +
>>>>>>>     >> +       return 0;
>>>>>>>     >> +}
>>>>>>>     >> +
>>>>>>>     >> +/**
>>>>>>>     >>   * amdgpu_vm_clear_freed - clear freed BOs in the PT
>>>>>>>     >>   *
>>>>>>>     >>   * @adev: amdgpu_device pointer
>>>>>>>     >> @@ -1949,7 +1994,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
>>>>>>>     >>                 struct dma_fence **fence)
>>>>>>>     >>  {
>>>>>>>     >>         struct amdgpu_bo_va_mapping *mapping;
>>>>>>>     >> -       uint64_t init_pte_value = 0;
>>>>>>>     >>         struct dma_fence *f = NULL;
>>>>>>>     >>         int r;
>>>>>>>     >> @@ -1958,13 +2002,10 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
>>>>>>>     >>                 struct amdgpu_bo_va_mapping, list);
>>>>>>>     >>         list_del(&mapping->list);
>>>>>>>     >> -       if (vm->pte_support_ats &&
>>>>>>>     >> -           mapping->start < AMDGPU_GMC_HOLE_START)
>>>>>>>     >> -               init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
>>>>>>>     >> +       r = amdgpu_vm_remove_ptes(adev, vm,
>>>>>>>     >> +                       (mapping->start + 0x1ff) & (~0x1ffll),
>>>>>>>     >> +                       (mapping->last + 1) & (~0x1ffll));
>>>>>>>     >> -       r = amdgpu_vm_bo_update_mapping(adev, vm, false, NULL,
>>>>>>>     >> -                                       mapping->start, mapping->last,
>>>>>>>     >> -                                       init_pte_value, 0, NULL, &f);
>>>>>>>     >>         amdgpu_vm_free_mapping(adev, vm, mapping, f);
>>>>>>>     >>         if (r) {
>>>>>>>     >>                 dma_fence_put(f);
>>>>>>>     >> @@ -1980,7 +2021,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
>>>>>>>     >>         }
>>>>>>>     >>         return 0;
>>>>>>>     >> -
>>>>>>>     >>  }
>>>>>>>     >>  /**
>>>>>>>     >
>>>>>>>     > _______________________________________________
>>>>>>>     > amd-gfx mailing list
>>>>>>>     > amd-gfx@xxxxxxxxxxxxxxxxxxxxx
>>>>>>>     > https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx