Hi Jerry, Let me clarify my understanding of the problem and the intention of the fix. Address range per page: 4KB Address range per PT (level 3): 2MB Address range per PD (level 2): 1GB Address range per PD (level 1): 512GB Address range per PD (level 0): 256TB Imagine for example a mapping that spans multiple level 2 page directories. Say from 0.5GB to 2.5GB. At level 0: from = 0 to = 0 At level 1: from = 0 to = 2 So for level 2 amdgpu_vm_alloc_levels gets called 3 times (pt_idx = [0..2]). Without my change, it gets called with the same saddr and eaddr 3 times. So it calculates the same "from" and "to" each time: from = 256 % 512 = 256 to = 767 % 512 = 255 Obviously that doesn't work. What we really want is this (from..to in level 2 when called for pt_idx values from level 1): For pt_idx=0 we want to fill the second half of the level 2 page directory with page tables: from = 256 to = 511 For pt_idx=1 we need to fill the entire level 2 page directories with page tables: from = 0 to = 511 For pt_idx=2 we want to fill the first half of the level 2 page directory with page tables: from = 0 to = 255 This creates a contiguous range of page tables across the three level 2 page directories that are spanned by the allocation. My change achieves that. I think you're right, there are some extra high bits in saddr and eaddr, but they get discarded by the modulo-division in the next recursion level. For added clarity the modulo division (or masking of high bits) could be moved up to the caller. Regards, Felix -- F e l i x K u e h l i n g SMTS Software Development Engineer | Vertical Workstation/Compute 1 Commerce Valley Dr. East, Markham, ON L3T 7X6 Canada (O) +1(289)695-1597 _ _ _ _____ _____ / \ | \ / | | _ \ \ _ | / A \ | \M/ | | |D) ) /|_| | /_/ \_\ |_| |_| |_____/ |__/ \| facebook.com/AMD | amd.com ________________________________ From: Zhang, Jerry Sent: Tuesday, March 28, 2017 10:54:03 PM To: Kuehling, Felix; amd-gfx at lists.freedesktop.org; Koenig, Christian; Zhou, David(ChunMing) Cc: Deucher, Alexander; Russell, Kent Subject: Re: [PATCH 3/3] drm/amdgpu: Fix multi-level page table bugs for large BOs On 03/29/2017 09:00 AM, Felix Kuehling wrote: > Fix the start/end address calculation for address ranges that span > multiple page directories in amdgpu_vm_alloc_levels. > > Add WARN_ONs if page tables aren't found. Otherwise the page table > update would just fail silently. > > Signed-off-by: Felix Kuehling <Felix.Kuehling at amd.com> > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 13 ++++++++++--- > 1 file changed, 10 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > index 818747f..44aa039 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > @@ -278,6 +278,8 @@ static int amdgpu_vm_alloc_levels(struct amdgpu_device *adev, > from = (saddr >> shift) % amdgpu_vm_num_entries(adev, level); > to = (eaddr >> shift) % amdgpu_vm_num_entries(adev, level); > > + WARN_ON(from > to); > + > if (to > parent->last_entry_used) > parent->last_entry_used = to; > > @@ -312,8 +314,11 @@ static int amdgpu_vm_alloc_levels(struct amdgpu_device *adev, > } > > if (level < adev->vm_manager.num_level) { > - r = amdgpu_vm_alloc_levels(adev, vm, entry, saddr, > - eaddr, level); > + uint64_t sub_saddr = (pt_idx == from) ? saddr : 0; > + uint64_t sub_eaddr = (pt_idx == to) ? eaddr : > + ((1 << shift) - 1); > + r = amdgpu_vm_alloc_levels(adev, vm, entry, sub_saddr, > + sub_eaddr, level); Do you mean to create a full sub-pt if pt_idx != from and != to? I didn't catch it up clear. In my perspective, we may need to clear the saddr and eaddr upper level bit before going on to allocate the next level PT. Otherwise, the number of next PT entry will be incorrect, more than num_entries as the upper level PT number calculated in. e.g. there is a address contains PD + PT1 + PT2 + PT3 for the first round, we could get correct "from" and "to" address with right "shift"(3 blocks), then walk over the address space in PD range. Then for the 2nd round, we cannot get the expected "from" and "to" address with "shift"(2 blocks), as it walks over the address space in PD + PT1 range. While we'd like it's in range PT1 indeed. The fault way goes on with later round. Perhaps, that's the root cause about the issue you face. Jerry. > if (r) > return r; > } > @@ -957,9 +962,11 @@ static struct amdgpu_bo *amdgpu_vm_get_pt(struct amdgpu_pte_update_params *p, > entry = &entry->entries[idx]; > } > > - if (level) > + if (WARN_ON(level)) > return NULL; > > + WARN_ON(!entry->bo); > + > return entry->bo; > } > > -------------- next part -------------- An HTML attachment was scrubbed... URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20170329/1fa2b02f/attachment.html>