On 08/30/2018 02:48 PM, Christian König wrote: > Am 30.08.2018 um 04:43 schrieb Zhang, Jerry (Junwei): >> On 08/29/2018 05:39 PM, Christian König wrote: >>> Am 29.08.2018 um 04:03 schrieb Zhang, Jerry (Junwei): >>>> On 08/28/2018 08:17 PM, Christian König wrote: >>>>> Correct sign extend the GMC addresses to 48bit. >>>> >>>> Could you explain a bit more why to extend the sign? >>> >>> The hardware works like this, in other words when bit 47 is set we must extend that into bits 48-63. >> >> Thanks. fine. >> >>> >>>> the address is uint64_t. is if failed in some case? >>> >>> What do you mean? >> >> Sorry for the typo without finishing the sentence before sending. >> >> I mean even if the address is uint64_t, still needs to extend the sign? >> what I was thinking is that the int64_t needs to do this. > > Well, no. What we would need is an int48_t type, but such thing doesn't exists and isn't easily implementable in C. If so, it would be better to understand. Thanks. > >> >>> >>>> >>>> > -/* VA hole for 48bit addresses on Vega10 */ >>>> > -#define AMDGPU_VA_HOLE_START 0x0000800000000000ULL >>>> > -#define AMDGPU_VA_HOLE_END 0xffff800000000000ULL >>>> >>>> BTW, the hole for 48bit is actually 47 bit left, any background for that? >>> >>> Well bits start counting at zero. So the 48bit addresses have bits 0-47. >> >> The VA hole is going to catch the VA address out of normal range, which for vega10 is 48-bit? > > Yes, exactly. > >> if so, 0x8000_0000_0000 ULL holds from 0~46 bits, starting from 128TB, but vega10 VA is 256TB > > Correct, the lower range is from 0x0-0x8000_0000_0000 and the higher range is from 0xffff_8000_0000_0000-0xffff_ffff_ffff_ffff. > >> >> it also could be found in old code gmc_v9.c, the mc_mask holds 48-bits address, like below: >> >> adev->gmc.mc_mask = 0xffff_ffff_ffff ULL; /* 48 bit MC */ >> >> But the VA hole start address is 0x8000_0000_0000 ULL, then libdrm gets virtual_address_max >> >> dev_info.virtual_address_max = min(vm_size, AMDGPU_VA_HOLE_START) >> // that's 0x8000_0000_0000 ULL actually > > We limit the reported VA size for backward compatibility with old userspace here. fine, got it. Thanks. Regards, Jerry > >> >> Above all, it looks the VA hole start should be 0x1_0000_0000_0000 UL. > > Nope, that isn't correct. The hole is between 0x8000_0000_0000 and 0xffff_8000_0000_0000. > > Regards, > Christian. > >> >> Regards, >> Jerry >> >>> >>> Regards, >>> Christian. >>> >>>> >>>> Regards, >>>> Jerry >>>> >>>>> >>>>> v2: sign extending turned out easier than thought. >>>>> v3: clean up the defines and move them into amdgpu_gmc.h as well >>>>> >>>>> Signed-off-by: Christian König <christian.koenig at amd.com> >>>>> --- >>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 +- >>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +- >>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 10 ++++----- >>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 26 ++++++++++++++++++++++ >>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 8 +++---- >>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 +- >>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 6 ++--- >>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 7 +++--- >>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 13 ----------- >>>>> 9 files changed, 44 insertions(+), 32 deletions(-) >>>>> >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>> index 8c652ecc4f9a..bc5ccfca68c5 100644 >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>> @@ -135,7 +135,7 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev) >>>>> .num_queue_per_pipe = adev->gfx.mec.num_queue_per_pipe, >>>>> .gpuvm_size = min(adev->vm_manager.max_pfn >>>>> << AMDGPU_GPU_PAGE_SHIFT, >>>>> - AMDGPU_VA_HOLE_START), >>>>> + AMDGPU_GMC_HOLE_START), >>>>> .drm_render_minor = adev->ddev->render->index >>>>> }; >>>>> >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>>>> index dd734970e167..ef2bfc04b41c 100644 >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>>>> @@ -835,7 +835,7 @@ static int amdgpu_cs_vm_handling(struct amdgpu_cs_parser *p) >>>>> if (chunk->chunk_id != AMDGPU_CHUNK_ID_IB) >>>>> continue; >>>>> >>>>> - va_start = chunk_ib->va_start & AMDGPU_VA_HOLE_MASK; >>>>> + va_start = chunk_ib->va_start & AMDGPU_GMC_HOLE_MASK; >>>>> r = amdgpu_cs_find_mapping(p, va_start, &aobj, &m); >>>>> if (r) { >>>>> DRM_ERROR("IB va_start is invalid\n"); >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c >>>>> index 71792d820ae0..d30a0838851b 100644 >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c >>>>> @@ -572,16 +572,16 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data, >>>>> return -EINVAL; >>>>> } >>>>> >>>>> - if (args->va_address >= AMDGPU_VA_HOLE_START && >>>>> - args->va_address < AMDGPU_VA_HOLE_END) { >>>>> + if (args->va_address >= AMDGPU_GMC_HOLE_START && >>>>> + args->va_address < AMDGPU_GMC_HOLE_END) { >>>>> dev_dbg(&dev->pdev->dev, >>>>> "va_address 0x%LX is in VA hole 0x%LX-0x%LX\n", >>>>> - args->va_address, AMDGPU_VA_HOLE_START, >>>>> - AMDGPU_VA_HOLE_END); >>>>> + args->va_address, AMDGPU_GMC_HOLE_START, >>>>> + AMDGPU_GMC_HOLE_END); >>>>> return -EINVAL; >>>>> } >>>>> >>>>> - args->va_address &= AMDGPU_VA_HOLE_MASK; >>>>> + args->va_address &= AMDGPU_GMC_HOLE_MASK; >>>>> >>>>> if ((args->flags & ~valid_flags) && (args->flags & ~prt_flags)) { >>>>> dev_dbg(&dev->pdev->dev, "invalid flags combination 0x%08X\n", >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h >>>>> index 0d2c9f65ca13..9d9c7a9f54e4 100644 >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h >>>>> @@ -30,6 +30,19 @@ >>>>> >>>>> #include "amdgpu_irq.h" >>>>> >>>>> +/* VA hole for 48bit addresses on Vega10 */ >>>>> +#define AMDGPU_GMC_HOLE_START 0x0000800000000000ULL >>>>> +#define AMDGPU_GMC_HOLE_END 0xffff800000000000ULL >>>>> + >>>>> +/* >>>>> + * Hardware is programmed as if the hole doesn't exists with start and end >>>>> + * address values. >>>>> + * >>>>> + * This mask is used to remove the upper 16bits of the VA and so come up with >>>>> + * the linear addr value. >>>>> + */ >>>>> +#define AMDGPU_GMC_HOLE_MASK 0x0000ffffffffffffULL >>>>> + >>>>> struct firmware; >>>>> >>>>> /* >>>>> @@ -131,6 +144,19 @@ static inline bool amdgpu_gmc_vram_full_visible(struct amdgpu_gmc *gmc) >>>>> return (gmc->real_vram_size == gmc->visible_vram_size); >>>>> } >>>>> >>>>> +/** >>>>> + * amdgpu_gmc_sign_extend - sign extend the given gmc address >>>>> + * >>>>> + * @addr: address to extend >>>>> + */ >>>>> +static inline uint64_t amdgpu_gmc_sign_extend(uint64_t addr) >>>>> +{ >>>>> + if (addr >= AMDGPU_GMC_HOLE_START) >>>>> + addr |= AMDGPU_GMC_HOLE_END; >>>>> + >>>>> + return addr; >>>>> +} >>>>> + >>>>> void amdgpu_gmc_get_pde_for_bo(struct amdgpu_bo *bo, int level, >>>>> uint64_t *addr, uint64_t *flags); >>>>> uint64_t amdgpu_gmc_pd_addr(struct amdgpu_bo *bo); >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c >>>>> index 9c4e45936ade..29ac3873eeb0 100644 >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c >>>>> @@ -655,11 +655,11 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file >>>>> >>>>> dev_info.virtual_address_offset = AMDGPU_VA_RESERVED_SIZE; >>>>> dev_info.virtual_address_max = >>>>> - min(vm_size, AMDGPU_VA_HOLE_START); >>>>> + min(vm_size, AMDGPU_GMC_HOLE_START); >>>>> >>>>> - if (vm_size > AMDGPU_VA_HOLE_START) { >>>>> - dev_info.high_va_offset = AMDGPU_VA_HOLE_END; >>>>> - dev_info.high_va_max = AMDGPU_VA_HOLE_END | vm_size; >>>>> + if (vm_size > AMDGPU_GMC_HOLE_START) { >>>>> + dev_info.high_va_offset = AMDGPU_GMC_HOLE_END; >>>>> + dev_info.high_va_max = AMDGPU_GMC_HOLE_END | vm_size; >>>>> } >>>>> dev_info.virtual_address_alignment = max((int)PAGE_SIZE, AMDGPU_GPU_PAGE_SIZE); >>>>> dev_info.pte_fragment_size = (1 << adev->vm_manager.fragment_size) * AMDGPU_GPU_PAGE_SIZE; >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c >>>>> index 5ddd4e87480b..9c3cc23488c2 100644 >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c >>>>> @@ -1370,7 +1370,7 @@ u64 amdgpu_bo_gpu_offset(struct amdgpu_bo *bo) >>>>> WARN_ON_ONCE(bo->tbo.mem.mem_type == TTM_PL_VRAM && >>>>> !(bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS)); >>>>> >>>>> - return bo->tbo.offset; >>>>> + return amdgpu_gmc_sign_extend(bo->tbo.offset); >>>>> } >>>>> >>>>> /** >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c >>>>> index 38856365580d..f2f358aa0597 100644 >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c >>>>> @@ -28,9 +28,7 @@ uint64_t amdgpu_csa_vaddr(struct amdgpu_device *adev) >>>>> uint64_t addr = adev->vm_manager.max_pfn << AMDGPU_GPU_PAGE_SHIFT; >>>>> >>>>> addr -= AMDGPU_VA_RESERVED_SIZE; >>>>> - >>>>> - if (addr >= AMDGPU_VA_HOLE_START) >>>>> - addr |= AMDGPU_VA_HOLE_END; >>>>> + addr = amdgpu_gmc_sign_extend(addr); >>>>> >>>>> return addr; >>>>> } >>>>> @@ -73,7 +71,7 @@ void amdgpu_free_static_csa(struct amdgpu_device *adev) { >>>>> int amdgpu_map_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm, >>>>> struct amdgpu_bo_va **bo_va) >>>>> { >>>>> - uint64_t csa_addr = amdgpu_csa_vaddr(adev) & AMDGPU_VA_HOLE_MASK; >>>>> + uint64_t csa_addr = amdgpu_csa_vaddr(adev) & AMDGPU_GMC_HOLE_MASK; >>>>> struct ww_acquire_ctx ticket; >>>>> struct list_head list; >>>>> struct amdgpu_bo_list_entry pd; >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c >>>>> index 1f4b8dfc0456..8bbba0127e42 100644 >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c >>>>> @@ -399,7 +399,7 @@ static int amdgpu_vm_clear_bo(struct amdgpu_device *adev, >>>>> if (level == adev->vm_manager.root_level) { >>>>> ats_entries = amdgpu_vm_level_shift(adev, level); >>>>> ats_entries += AMDGPU_GPU_PAGE_SHIFT; >>>>> - ats_entries = AMDGPU_VA_HOLE_START >> ats_entries; >>>>> + ats_entries = AMDGPU_GMC_HOLE_START >> ats_entries; >>>>> ats_entries = min(ats_entries, entries); >>>>> entries -= ats_entries; >>>>> } else { >>>>> @@ -629,7 +629,7 @@ int amdgpu_vm_alloc_pts(struct amdgpu_device *adev, >>>>> eaddr = saddr + size - 1; >>>>> >>>>> if (vm->pte_support_ats) >>>>> - ats = saddr < AMDGPU_VA_HOLE_START; >>>>> + ats = saddr < AMDGPU_GMC_HOLE_START; >>>>> >>>>> saddr /= AMDGPU_GPU_PAGE_SIZE; >>>>> eaddr /= AMDGPU_GPU_PAGE_SIZE; >>>>> @@ -1934,7 +1934,8 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev, >>>>> struct amdgpu_bo_va_mapping, list); >>>>> list_del(&mapping->list); >>>>> >>>>> - if (vm->pte_support_ats && mapping->start < AMDGPU_VA_HOLE_START) >>>>> + if (vm->pte_support_ats && >>>>> + mapping->start < AMDGPU_GMC_HOLE_START) >>>>> init_pte_value = AMDGPU_PTE_DEFAULT_ATC; >>>>> >>>>> r = amdgpu_vm_bo_update_mapping(adev, NULL, NULL, vm, >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h >>>>> index 94fe47890adf..2ce452198017 100644 >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h >>>>> @@ -101,19 +101,6 @@ struct amdgpu_bo_list_entry; >>>>> /* hardcode that limit for now */ >>>>> #define AMDGPU_VA_RESERVED_SIZE (1ULL << 20) >>>>> >>>>> -/* VA hole for 48bit addresses on Vega10 */ >>>>> -#define AMDGPU_VA_HOLE_START 0x0000800000000000ULL >>>>> -#define AMDGPU_VA_HOLE_END 0xffff800000000000ULL >>>>> - >>>>> -/* >>>>> - * Hardware is programmed as if the hole doesn't exists with start and end >>>>> - * address values. >>>>> - * >>>>> - * This mask is used to remove the upper 16bits of the VA and so come up with >>>>> - * the linear addr value. >>>>> - */ >>>>> -#define AMDGPU_VA_HOLE_MASK 0x0000ffffffffffffULL >>>>> - >>>>> /* max vmids dedicated for process */ >>>>> #define AMDGPU_VM_MAX_RESERVED_VMID 1 >>>>> >>>>> >>> >