On 17-03-29 02:52 AM, Christian König wrote:
> On 29.03.2017 at 07:58, Zhang, Jerry (Junwei) wrote:
>> Hi Felix,
>>
>> Thanks for your patient illustration.
>> I got your meaning then, and thanks for fixing that.
>>
>> > I think you're right, there are some extra high bits in saddr and eaddr,
>> > but they get discarded by the modulo-division in the next recursion
>> > level. For added clarity the modulo division (or masking of high bits)
>> > could be moved up to the caller.
>>
>> It may be clearer if the modulo-division is done by the caller.
>
> Yeah, indeed that isn't easy to get at first glance.

OK, I'll fix that.

>
> Please clean that up and also remove all those WARN_ON(), we don't
> want to spam the system log with backtraces like this.

I disagree. When you see the backtraces from WARN_ONs in amdgpu_vm_get_pt, it
means the driver is trying to update a page table that hasn't been allocated.
Without those WARN_ONs there is absolutely no warning of any kind -
amdgpu_vm_update_pte just fails silently (it's a void function). Therefore I
wouldn't call those warnings spam.

Given that the multi-level VM PT allocation code is non-trivial, I think
that's a good sanity check to have, and it's really helpful to see in a log
when users report random VM faults.

Regards,
  Felix

>
> Regards,
> Christian.
>
>> Anyway,
>> Reviewed-by: Junwei Zhang <Jerry.Zhang at amd.com>
>>
>>
>> On 03/29/2017 11:24 AM, Kuehling, Felix wrote:
>>> Hi Jerry,
>>>
>>> Let me clarify my understanding of the problem and the intention of
>>> the fix.
>>>
>>> Address range per page: 4KB
>>> Address range per PT (level 3): 2MB
>>> Address range per PD (level 2): 1GB
>>> Address range per PD (level 1): 512GB
>>> Address range per PD (level 0): 256TB
>>>
>>> Imagine for example a mapping that spans multiple level 2 page
>>> directories. Say from 0.5GB to 2.5GB.
>>>
>>> At level 0:
>>> from = 0
>>> to = 0
>>>
>>> At level 1:
>>> from = 0
>>> to = 2
>>>
>>> So for level 2 amdgpu_vm_alloc_levels gets called 3 times (pt_idx = [0..2]).
>>> Without my change, it gets called with the same saddr and eaddr 3 times,
>>> so it calculates the same "from" and "to" each time:
>>> from = 256 % 512 = 256
>>> to = 767 % 512 = 255
>>>
>>> Obviously that doesn't work. What we really want is this (from..to in
>>> level 2 when called for pt_idx values from level 1):
>>>
>>> For pt_idx=0 we want to fill the second half of the level 2 page
>>> directory with page tables:
>>> from = 256
>>> to = 511
>>>
>>> For pt_idx=1 we need to fill the entire level 2 page directory with
>>> page tables:
>>> from = 0
>>> to = 511
>>>
>>> For pt_idx=2 we want to fill the first half of the level 2 page
>>> directory with page tables:
>>> from = 0
>>> to = 255
>>>
>>> This creates a contiguous range of page tables across the three level 2
>>> page directories that are spanned by the allocation. My change achieves
>>> that.
>>>
>>> I think you're right, there are some extra high bits in saddr and eaddr,
>>> but they get discarded by the modulo-division in the next recursion
>>> level. For added clarity the modulo division (or masking of high bits)
>>> could be moved up to the caller.
>>>
>>> Regards,
>>>   Felix
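
To make the arithmetic above easier to check, here is a minimal stand-alone
sketch of the same index calculation. It is not the driver code: it assumes
4KB pages, 512 entries (9 bits) per level, level 3 being the PT level, and
addresses counted in pages, exactly as in the example above, and it includes
the sub_saddr/sub_eaddr clamping that the patch adds.

/* Stand-alone sketch of the per-level from/to calculation (not driver code). */
#include <stdio.h>
#include <stdint.h>

#define NUM_ENTRIES 512ULL  /* 9 bits per level in this example */

static void alloc_levels(uint64_t saddr, uint64_t eaddr, unsigned int level)
{
    /* One entry at this level covers 2^shift pages. */
    unsigned int shift = 9 * (3 - level);
    uint64_t from = (saddr >> shift) % NUM_ENTRIES;
    uint64_t to = (eaddr >> shift) % NUM_ENTRIES;
    uint64_t pt_idx;

    printf("level %u: from = %llu, to = %llu\n", level,
           (unsigned long long)from, (unsigned long long)to);

    if (level >= 2)  /* the example only goes down to level 2 */
        return;

    for (pt_idx = from; pt_idx <= to; ++pt_idx) {
        /* The fix: only the first/last child keep saddr/eaddr,
         * fully covered children span their whole sub-range. */
        uint64_t sub_saddr = (pt_idx == from) ? saddr : 0;
        uint64_t sub_eaddr = (pt_idx == to) ? eaddr :
                             ((1ULL << shift) - 1);

        alloc_levels(sub_saddr, sub_eaddr, level + 1);
    }
}

int main(void)
{
    /* 0.5GB..2.5GB mapping, expressed in 4KB pages. */
    alloc_levels(0x20000, 0xa0000 - 1, 0);
    return 0;
}

With these inputs it prints from/to of 0/0 at level 0, 0/2 at level 1, and
then 256..511, 0..511 and 0..255 for the three level 2 calls, matching the
numbers in the explanation above.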
>>>
>>>
>>> --
>>> F e l i x   K u e h l i n g
>>> SMTS Software Development Engineer | Vertical Workstation/Compute
>>> 1 Commerce Valley Dr. East, Markham, ON L3T 7X6 Canada
>>> (O) +1(289)695-1597
>>> facebook.com/AMD | amd.com
>>> -------------------------------------------------------------------------------
>>> *From:* Zhang, Jerry
>>> *Sent:* Tuesday, March 28, 2017 10:54:03 PM
>>> *To:* Kuehling, Felix; amd-gfx at lists.freedesktop.org; Koenig, Christian; Zhou, David(ChunMing)
>>> *Cc:* Deucher, Alexander; Russell, Kent
>>> *Subject:* Re: [PATCH 3/3] drm/amdgpu: Fix multi-level page table bugs for large BOs
>>>
>>> On 03/29/2017 09:00 AM, Felix Kuehling wrote:
>>>> Fix the start/end address calculation for address ranges that span
>>>> multiple page directories in amdgpu_vm_alloc_levels.
>>>>
>>>> Add WARN_ONs if page tables aren't found. Otherwise the page table
>>>> update would just fail silently.
>>>>
>>>> Signed-off-by: Felix Kuehling <Felix.Kuehling at amd.com>
>>>> ---
>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 13 ++++++++++---
>>>>  1 file changed, 10 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> index 818747f..44aa039 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> @@ -278,6 +278,8 @@ static int amdgpu_vm_alloc_levels(struct amdgpu_device *adev,
>>>>      from = (saddr >> shift) % amdgpu_vm_num_entries(adev, level);
>>>>      to = (eaddr >> shift) % amdgpu_vm_num_entries(adev, level);
>>>>
>>>> +    WARN_ON(from > to);
>>>> +
>>>>      if (to > parent->last_entry_used)
>>>>          parent->last_entry_used = to;
>>>>
>>>> @@ -312,8 +314,11 @@ static int amdgpu_vm_alloc_levels(struct amdgpu_device *adev,
>>>>          }
>>>>
>>>>          if (level < adev->vm_manager.num_level) {
>>>> -            r = amdgpu_vm_alloc_levels(adev, vm, entry, saddr,
>>>> -                                       eaddr, level);
>>>> +            uint64_t sub_saddr = (pt_idx == from) ? saddr : 0;
>>>> +            uint64_t sub_eaddr = (pt_idx == to) ? eaddr :
>>>> +                                 ((1 << shift) - 1);
>>>> +            r = amdgpu_vm_alloc_levels(adev, vm, entry, sub_saddr,
>>>> +                                       sub_eaddr, level);
>>>
>>> Do you mean to create a full sub-PT if pt_idx != from and != to?
>>> I didn't quite follow that.
>>>
>>> From my perspective, we may need to clear the upper-level bits of saddr
>>> and eaddr before going on to allocate the next-level PT.
>>> Otherwise the entry number for the next PT will be incorrect - larger
>>> than num_entries - because the upper-level PT bits are still counted in.
>>>
>>> E.g. an address consists of
>>> PD + PT1 + PT2 + PT3
>>>
>>> For the first round, we get the correct "from" and "to" with the right
>>> "shift" (3 blocks), and walk over the address space in the PD range.
>>>
>>> Then for the 2nd round, we don't get the expected "from" and "to" with
>>> "shift" (2 blocks), as it walks over the address space of the PD + PT1
>>> range, while we really want it to stay within the PT1 range.
>>>
>>> The same fault carries on in the later rounds.
>>>
>>> Perhaps that's the root cause of the issue you're seeing.
>>>
>>> Jerry.
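
For reference, a quick stand-alone check of the high-bits point discussed
above (a sketch, not driver code, using the 512-entry, 9-bit-per-level layout
from the example): taking the modulo after the shift discards the same
upper-level bits that explicit masking would, so the next-level index stays
below num_entries.

/* Sketch: modulo after the shift is equivalent to masking the upper bits first. */
#include <assert.h>
#include <stdint.h>

int main(void)
{
    uint64_t eaddr = 0xa0000 - 1;   /* 2.5GB in 4KB pages, from the example */
    unsigned int shift = 9;         /* one level-2 entry covers 512 pages   */
    uint64_t num_entries = 512;

    /* Index as computed in amdgpu_vm_alloc_levels: shift, then modulo. */
    uint64_t idx_modulo = (eaddr >> shift) % num_entries;

    /* Same index when the upper-level bits are masked off first. */
    uint64_t idx_masked = (eaddr & (num_entries * (1ULL << shift) - 1)) >> shift;

    assert(idx_modulo == 255 && idx_masked == 255);
    return 0;
}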
>>>
>>>>              if (r)
>>>>                  return r;
>>>>          }
>>>> @@ -957,9 +962,11 @@ static struct amdgpu_bo *amdgpu_vm_get_pt(struct amdgpu_pte_update_params *p,
>>>>          entry = &entry->entries[idx];
>>>>      }
>>>>
>>>> -    if (level)
>>>> +    if (WARN_ON(level))
>>>>          return NULL;
>>>>
>>>> +    WARN_ON(!entry->bo);
>>>> +
>>>>      return entry->bo;
>>>>  }
>>>>
>>>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
>