The hardware is vega10 and the test is KFDMemoryTest.BigBufferStressTest.
More detail is in Jira ticket SWDEV-201443.

Regards,
Eric

On 2019-10-31 10:08 a.m., StDenis, Tom wrote:
> I could try it on my carrizo/polaris setup. Is there a test procedure I
> could follow to trigger the changed code paths?
>
>
> Tom
>
> On 2019-10-31 6:41 a.m., Koenig, Christian wrote:
>> Just tested this and amdgpu_vm_update_ptes() indeed works as expected.
>>
>> When you free at least 2MB, the lowest level of page tables is freed
>> up again.
>>
>> BTW: What hardware have you tested this on? On gfx8 and older it is
>> expected that page tables are never freed.
>>
>> Regards,
>> Christian.
>>
>> On 30.10.19 at 19:11, Christian König wrote:
>>> Then I don't see how this patch actually changes anything.
>>>
>>> It could only be a bug in amdgpu_vm_update_ptes(). I'm going to
>>> investigate this, but I won't have time to look into the ticket in
>>> detail.
>>>
>>> Regards,
>>> Christian.
>>>
>>> On 30.10.19 at 19:00, Huang, JinHuiEric wrote:
>>>> Actually I do prevent removing in-use PTs with this:
>>>>
>>>> +               r = amdgpu_vm_remove_ptes(adev, vm,
>>>> +                               (mapping->start + 0x1ff) & (~0x1ffll),
>>>> +                               (mapping->last + 1) & (~0x1ffll));
>>>>
>>>> This only removes page tables for ranges aligned to 2MB. And I have
>>>> tested it at least on the KFD tests without anything breaking.
>>>>
>>>> By the way, I am not familiar with the memory stuff. This patch is
>>>> the best I can do for now. Could you take a look at the Jira ticket
>>>> SWDEV-201443 and find a better solution? Thanks!
>>>>
>>>> Regards,
>>>>
>>>> Eric
>>>>
>>>> On 2019-10-30 1:57 p.m., Christian König wrote:
>>>>> One thing I've forgotten:
>>>>>
>>>>> What you could maybe do to improve the situation is to join
>>>>> adjacent ranges in amdgpu_vm_clear_freed(), but I'm not sure what
>>>>> the chances are that the ranges are all freed together.
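To make the alignment in Eric's snippet above concrete: with 4KB pages and 512 PTEs per page-table block, one PTB covers 512 pfns, i.e. 2MB, which is why the mask is 0x1ff. The following is a minimal standalone sketch of that arithmetic (helper names are mine, not from the driver); it rounds the start pfn up and the exclusive end pfn down to PTB boundaries, so a sub-2MB range yields start > end and nothing is freed.

```c
#include <stdint.h>

/* 512 PTEs per page-table block with 4KB pages => 2MB, hence mask 0x1ff. */
#define PTB_MASK 0x1ffULL

/* Round a start pfn up to the next PTB-aligned (2MB) boundary. */
static uint64_t ptb_align_start(uint64_t start)
{
	return (start + PTB_MASK) & ~PTB_MASK;
}

/* Round the exclusive end pfn (last + 1) down to a PTB boundary. */
static uint64_t ptb_align_end(uint64_t last)
{
	return (last + 1) & ~PTB_MASK;
}
```

Only page tables whose full 512-pfn span lies inside the mapping are candidates for removal; anything partially covered is skipped by construction.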
>>>>>
>>>>> The only other alternative I can see would be to check the mappings
>>>>> of a range in amdgpu_update_ptes() and see if you could walk the
>>>>> tree up if the valid flag is not set and there are no mappings left
>>>>> for a page table.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>> On 30.10.19 at 18:42, Koenig, Christian wrote:
>>>>>>> The valid flag doesn't take effect in this function.
>>>>>> That's irrelevant.
>>>>>>
>>>>>> See, what amdgpu_vm_update_ptes() does is to first determine the
>>>>>> fragment size:
>>>>>>> amdgpu_vm_fragment(params, frag_start, end, flags, &frag, &frag_end);
>>>>>> Then we walk down the tree:
>>>>>>> amdgpu_vm_pt_start(adev, params->vm, start, &cursor);
>>>>>>> while (cursor.pfn < end) {
>>>>>> And make sure that the page tables covering the address range are
>>>>>> actually allocated:
>>>>>>> r = amdgpu_vm_alloc_pts(params->adev, params->vm, &cursor);
>>>>>> Then we update the tables with the flags and addresses and free up
>>>>>> subsequent tables in the case of huge pages or freed-up areas:
>>>>>>> /* Free all child entries */
>>>>>>> while (cursor.pfn < frag_start) {
>>>>>>>         amdgpu_vm_free_pts(adev, params->vm, &cursor);
>>>>>>>         amdgpu_vm_pt_next(adev, &cursor);
>>>>>>> }
>>>>>> This is the maximum you can free, because all other page tables are
>>>>>> not completely covered by the range and so are potentially still in
>>>>>> use.
>>>>>>
>>>>>> And I have the strong suspicion that this is what your patch is
>>>>>> actually doing wrong. In other words, you are also freeing page
>>>>>> tables which are only partially covered by the range and so are
>>>>>> potentially still in use.
>>>>>>
>>>>>> Since we don't have any tracking of how many entries in a page
>>>>>> table are currently valid and how many are invalid, we actually
>>>>>> can't implement what you are trying to do here. So the patch is
>>>>>> definitely somehow broken.
>>>>>>
>>>>>> Regards,
>>>>>> Christian.
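Christian's invariant above ("the maximum you can free") boils down to a simple coverage check: a page table may only be freed if the updated range spans it entirely, since a partially covered table can still hold live mappings of other BOs. A minimal illustration of that predicate (the helper is mine, not driver code; `covered_shift` is the number of pfn bits one table covers):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative only: a page table whose naturally aligned base pfn is
 * pt_base and which covers 2^covered_shift pfns may be freed only if
 * the update range [start, end) spans it entirely.  Partially covered
 * tables may still hold live mappings and must be kept.
 */
static bool pt_fully_covered(uint64_t start, uint64_t end,
			     uint64_t pt_base, unsigned covered_shift)
{
	uint64_t pt_size = 1ULL << covered_shift;

	/* pt_base is assumed aligned to pt_size. */
	return start <= pt_base && pt_base + pt_size <= end;
}
```

With `covered_shift = 9` (a 512-entry PTB), a table is freeable only when the range covers all 512 of its pfns; this is exactly why tracking per-table valid-entry counts would be needed to free anything more.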
>>>>>>
>>>>>> On 30.10.19 at 17:55, Huang, JinHuiEric wrote:
>>>>>>> The valid flag doesn't take effect in this function.
>>>>>>> amdgpu_vm_alloc_pts() is always executed; it only depends on
>>>>>>> "cursor.pfn < end". The valid flag is only checked here for
>>>>>>> ASICs below GMC v9:
>>>>>>>
>>>>>>>         if (adev->asic_type < CHIP_VEGA10 &&
>>>>>>>             (flags & AMDGPU_PTE_VALID))...
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Eric
>>>>>>>
>>>>>>> On 2019-10-30 12:30 p.m., Koenig, Christian wrote:
>>>>>>>>
>>>>>>>> On 30.10.2019 17:19, "Huang, JinHuiEric"
>>>>>>>> <JinHuiEric.Huang@xxxxxxx> wrote:
>>>>>>>>
>>>>>>>>     I tested it and it saves a lot of vram in the KFD big buffer
>>>>>>>>     stress test. I think there are two reasons:
>>>>>>>>
>>>>>>>>     1. Calling amdgpu_vm_update_ptes() during unmapping will
>>>>>>>>     allocate unnecessary PTs, because there is no flag in
>>>>>>>>     amdgpu_vm_update_ptes() to determine whether the VA is being
>>>>>>>>     mapped or unmapped. This saves the most memory.
>>>>>>>>
>>>>>>>> That's not correct. The valid flag is used for this.
>>>>>>>>
>>>>>>>>     2. Intentionally removing those unmapped PTs is the logical
>>>>>>>>     expectation, although it doesn't remove that many PTs.
>>>>>>>>
>>>>>>>> Well, I actually don't see a change to what update_ptes is doing
>>>>>>>> and have the strong suspicion that the patch is simply broken.
>>>>>>>>
>>>>>>>> You either free page tables which are potentially still in use,
>>>>>>>> or update_ptes doesn't free page tables when the valid bit is
>>>>>>>> not set.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Christian.
>>>>>>>>
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Eric
>>>>>>>>
>>>>>>>> On 2019-10-30 11:57 a.m., Koenig, Christian wrote:
>>>>>>>>
>>>>>>>> On 30.10.2019 16:47, "Kuehling, Felix"
>>>>>>>> <Felix.Kuehling@xxxxxxx> wrote:
>>>>>>>>
>>>>>>>>     On 2019-10-30 9:52 a.m., Christian König wrote:
>>>>>>>>     > Am 29.10.19 um 21:06 schrieb Huang, JinHuiEric:
>>>>>>>>     >> The issue is that PT BOs are not freed when unmapping VA,
>>>>>>>>     >> which causes accumulated vram usage to become huge in some
>>>>>>>>     >> memory stress tests, such as the kfd big buffer stress test.
>>>>>>>>     >> Function amdgpu_vm_bo_update_mapping() is called by both
>>>>>>>>     >> amdgpu_vm_bo_update() and amdgpu_vm_clear_freed(). The
>>>>>>>>     >> solution is replacing amdgpu_vm_bo_update_mapping() in
>>>>>>>>     >> amdgpu_vm_clear_freed() with a function that removes PT BOs,
>>>>>>>>     >> to save vram usage.
>>>>>>>>     >
>>>>>>>>     > NAK, that is intentional behavior.
>>>>>>>>     >
>>>>>>>>     > Otherwise we can run into out-of-memory situations when
>>>>>>>>     > page tables need to be allocated again under stress.
>>>>>>>>
>>>>>>>>     That's a bit arbitrary and inconsistent. We are freeing page
>>>>>>>>     tables in other situations, when a mapping uses huge pages in
>>>>>>>>     amdgpu_vm_update_ptes. Why not when a mapping is destroyed
>>>>>>>>     completely?
>>>>>>>>
>>>>>>>>     I'm actually a bit surprised that the huge-page handling in
>>>>>>>>     amdgpu_vm_update_ptes isn't kicking in to free up lower-level
>>>>>>>>     page tables when a BO is unmapped.
>>>>>>>>
>>>>>>>> Well, it does free the lower level, and that is already causing
>>>>>>>> problems (that's why I added the reserved space).
>>>>>>>>
>>>>>>>> What we don't do is free the higher levels.
>>>>>>>>
>>>>>>>> E.g. when you free a 2MB BO we free the lowest level; if we free
>>>>>>>> a 1GB BO we free the two lowest levels, etc...
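Christian's examples (2MB frees one level, 1GB frees two) follow from the 9-bit fan-out of each table level. A rough sketch of that relationship, assuming 4KB pages and 512 entries per level as on gfx9 (the function is illustrative, not from the driver):

```c
#include <stdint.h>

/*
 * Rough sketch, assuming 4KB pages and 512 entries (9 address bits)
 * per page-table level: a naturally aligned buffer of the given size
 * in bytes completely covers this many of the lowest table levels,
 * which are therefore candidates for freeing on unmap.  Higher,
 * partially covered levels are shared with other mappings and kept.
 */
static unsigned levels_fully_covered(uint64_t size)
{
	uint64_t covered = 1ULL << 21;	/* one PTB: 512 * 4KB = 2MB */
	unsigned levels = 0;

	while (size >= covered) {
		levels++;
		covered <<= 9;		/* next level spans 512x more */
	}
	return levels;
}
```

So a 2MB buffer covers exactly one PTB, a 1GB buffer covers a full PD0 entry chain (two levels), and anything below 2MB covers no complete table at all, which matches the "maximum saving of 2MB per VM" point below.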
>>>>>>>>
>>>>>>>> The problem with freeing the higher levels is that you don't
>>>>>>>> know who else is using them. E.g. we would need to check all
>>>>>>>> entries when we unmap one.
>>>>>>>>
>>>>>>>> It's simply not worth it for a maximum saving of 2MB per VM.
>>>>>>>>
>>>>>>>> Writing this, I'm actually wondering how you ended up with this
>>>>>>>> issue? There shouldn't be much savings from this.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Christian.
>>>>>>>>
>>>>>>>>     Regards,
>>>>>>>>       Felix
>>>>>>>>
>>>>>>>>     >
>>>>>>>>     > Regards,
>>>>>>>>     > Christian.
>>>>>>>>     >
>>>>>>>>     >>
>>>>>>>>     >> Change-Id: Ic24e35bff8ca85265b418a642373f189d972a924
>>>>>>>>     >> Signed-off-by: Eric Huang <JinhuiEric.Huang@xxxxxxx>
>>>>>>>>     >> ---
>>>>>>>>     >>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 56 +++++++++++++++++++++++++++++-----
>>>>>>>>     >>   1 file changed, 48 insertions(+), 8 deletions(-)
>>>>>>>>     >>
>>>>>>>>     >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>     >> index 0f4c3b2..8a480c7 100644
>>>>>>>>     >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>     >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>     >> @@ -1930,6 +1930,51 @@ static void amdgpu_vm_prt_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm)
>>>>>>>>     >>   }
>>>>>>>>     >>   /**
>>>>>>>>     >> + * amdgpu_vm_remove_ptes - free PT BOs
>>>>>>>>     >> + *
>>>>>>>>     >> + * @adev: amdgpu device structure
>>>>>>>>     >> + * @vm: amdgpu vm structure
>>>>>>>>     >> + * @start: start of mapped range
>>>>>>>>     >> + * @end: end of mapped entry
>>>>>>>>     >> + *
>>>>>>>>     >> + * Free the page table level.
>>>>>>>>     >> + */
>>>>>>>>     >> +static int amdgpu_vm_remove_ptes(struct amdgpu_device *adev,
>>>>>>>>     >> +		struct amdgpu_vm *vm, uint64_t start, uint64_t end)
>>>>>>>>     >> +{
>>>>>>>>     >> +	struct amdgpu_vm_pt_cursor cursor;
>>>>>>>>     >> +	unsigned shift, num_entries;
>>>>>>>>     >> +
>>>>>>>>     >> +	amdgpu_vm_pt_start(adev, vm, start, &cursor);
>>>>>>>>     >> +	while (cursor.level < AMDGPU_VM_PTB) {
>>>>>>>>     >> +		if (!amdgpu_vm_pt_descendant(adev, &cursor))
>>>>>>>>     >> +			return -ENOENT;
>>>>>>>>     >> +	}
>>>>>>>>     >> +
>>>>>>>>     >> +	while (cursor.pfn < end) {
>>>>>>>>     >> +		amdgpu_vm_free_table(cursor.entry);
>>>>>>>>     >> +		num_entries = amdgpu_vm_num_entries(adev, cursor.level - 1);
>>>>>>>>     >> +
>>>>>>>>     >> +		if (cursor.entry != &cursor.parent->entries[num_entries - 1]) {
>>>>>>>>     >> +			/* Next ptb entry */
>>>>>>>>     >> +			shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
>>>>>>>>     >> +			cursor.pfn += 1ULL << shift;
>>>>>>>>     >> +			cursor.pfn &= ~((1ULL << shift) - 1);
>>>>>>>>     >> +			cursor.entry++;
>>>>>>>>     >> +		} else {
>>>>>>>>     >> +			/* Next ptb entry in next pd0 entry */
>>>>>>>>     >> +			amdgpu_vm_pt_ancestor(&cursor);
>>>>>>>>     >> +			shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
>>>>>>>>     >> +			cursor.pfn += 1ULL << shift;
>>>>>>>>     >> +			cursor.pfn &= ~((1ULL << shift) - 1);
>>>>>>>>     >> +			amdgpu_vm_pt_descendant(adev, &cursor);
>>>>>>>>     >> +		}
>>>>>>>>     >> +	}
>>>>>>>>     >> +
>>>>>>>>     >> +	return 0;
>>>>>>>>     >> +}
>>>>>>>>     >> +
>>>>>>>>     >> +/**
>>>>>>>>     >>    * amdgpu_vm_clear_freed - clear freed BOs in the PT
>>>>>>>>     >>    *
>>>>>>>>     >>    * @adev: amdgpu_device pointer
>>>>>>>>     >> @@ -1949,7 +1994,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
>>>>>>>>     >>   			  struct dma_fence **fence)
>>>>>>>>     >>   {
>>>>>>>>     >>   	struct amdgpu_bo_va_mapping *mapping;
>>>>>>>>     >> -	uint64_t init_pte_value = 0;
>>>>>>>>     >>   	struct dma_fence *f = NULL;
>>>>>>>>     >>   	int r;
>>>>>>>>     >> @@ -1958,13 +2002,10 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
>>>>>>>>     >>   			struct amdgpu_bo_va_mapping, list);
>>>>>>>>     >>   		list_del(&mapping->list);
>>>>>>>>     >> -		if (vm->pte_support_ats &&
>>>>>>>>     >> -		    mapping->start < AMDGPU_GMC_HOLE_START)
>>>>>>>>     >> -			init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
>>>>>>>>     >> +		r = amdgpu_vm_remove_ptes(adev, vm,
>>>>>>>>     >> +				(mapping->start + 0x1ff) & (~0x1ffll),
>>>>>>>>     >> +				(mapping->last + 1) & (~0x1ffll));
>>>>>>>>     >> -		r = amdgpu_vm_bo_update_mapping(adev, vm, false, NULL,
>>>>>>>>     >> -					mapping->start, mapping->last,
>>>>>>>>     >> -					init_pte_value, 0, NULL, &f);
>>>>>>>>     >>   		amdgpu_vm_free_mapping(adev, vm, mapping, f);
>>>>>>>>     >>   		if (r) {
>>>>>>>>     >>   			dma_fence_put(f);
>>>>>>>>     >> @@ -1980,7 +2021,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
>>>>>>>>     >>   	}
>>>>>>>>     >>   	return 0;
>>>>>>>>     >> -
>>>>>>>>     >>   }
>>>>>>>>     >>   /**
>>>>>>>>     >
>>>>>>>>     > _______________________________________________
>>>>>>>>     > amd-gfx mailing list
>>>>>>>>     > amd-gfx@xxxxxxxxxxxxxxxxxxxxx
>>>>>>>>     > https://lists.freedesktop.org/mailman/listinfo/amd-gfx