Re: [PATCH 3/6] drm/amdgpu: allocate VM PDs/PTs on demand

"Kuehling, Felix" <Felix.Kuehling@xxxxxxx> · Tue, 12 Mar 2019 20:25:23 +0000

The root cause is that we don't wait after calling amdgpu_vm_clear_bo in 
amdgpu_vm_alloc_pts.

Waiting for the page table BOs to be idle for CPU page table updates is 
done in amdgpu_vm_bo_update_mapping. That is now *before* the page 
tables are actually allocated and cleared in amdgpu_vm_update_ptes.

We'll need to move the waiting for page tables to be idle into 
amdgpu_vm_alloc_pts or amdgpu_vm_update_ptes.

Regards,
   Felix

On 2019-03-12 3:02 p.m., Felix Kuehling wrote:
> I find that it's related to CPU page table updates. If I force page 
> table updates with SDMA, I don't get the VM fault.
>
> Regards,
>   Felix
>
> On 2019-03-11 12:55 p.m., Christian König wrote:
>> Hi guys,
>>
>> well it's most likely some missing handling in the KFD, so I'm rather 
>> reluctant to revert the change immediately.
>>
>> Problem is that I don't have time right now to look into it 
>> immediately. So Kent can you continue to take a look?
>>
>> Sounds like its crashing immediately, so it should be something obvious.
>>
>> Christian.
>>
>> Am 11.03.19 um 10:49 schrieb Russell, Kent:
>>>  From what I've been able to dig through, the VM Fault seems to 
>>> occur right after a doorbell mmap, but that's as far as I got. I can 
>>> try to revert it in today's merge and see how things go.
>>>
>>>   Kent
>>>
>>>> -----Original Message-----
>>>> From: Kuehling, Felix
>>>> Sent: Friday, March 08, 2019 11:16 PM
>>>> To: Koenig, Christian <Christian.Koenig@xxxxxxx>; Russell, Kent
>>>> <Kent.Russell@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
>>>> Subject: RE: [PATCH 3/6] drm/amdgpu: allocate VM PDs/PTs on demand
>>>>
>>>> My concerns were related to eviction fence handing. It would 
>>>> manifest by
>>>> unnecessary eviction callbacks into KFD that aren't cause by real 
>>>> evictions. I
>>>> addressed that with a previous patch series that removed the need to
>>>> remove eviction fences and add them back around page table updates in
>>>> amdgpu_amdkfd_gpuvm.c.
>>>>
>>>> I don't know what's going on here. I can probably take a look on 
>>>> Monday. I
>>>> haven't considered what changed with respect to PD updates.
>>>>
>>>> Kent, can we temporarily revert the offending change in 
>>>> amd-kfd-staging
>>>> just to unblock the merge?
>>>>
>>>> Christian, I think KFD is currently broken on amd-staging-drm-next. 
>>>> If we're
>>>> serious about supporting KFD upstream, you may also want to consider
>>>> reverting your change there for now. Also consider building the 
>>>> Thunk and
>>>> kfdtest so you can do quick smoke tests locally whenever you make
>>>> amdgpu_vm changes that can affect KFD.
>>>> https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface
>>>>
>>>> Regards,
>>>>    Felix
>>>>
>>>> -----Original Message-----
>>>> From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of
>>>> Christian König
>>>> Sent: Friday, March 08, 2019 9:14 AM
>>>> To: Russell, Kent <Kent.Russell@xxxxxxx>; 
>>>> amd-gfx@xxxxxxxxxxxxxxxxxxxxx
>>>> Subject: Re: [PATCH 3/6] drm/amdgpu: allocate VM PDs/PTs on demand
>>>>
>>>> My best guess is that we forget somewhere to update the PDs. What
>>>> hardware is that on?
>>>>
>>>> Felix already mentioned that this could be problematic for the KFD.
>>>>
>>>> Maybe he has an idea,
>>>> Christian.
>>>>
>>>> Am 08.03.19 um 15:04 schrieb Russell, Kent:
>>>>> Hi Christian,
>>>>>
>>>>> This patch ended up causing a VM Fault in KFDTest. Reverting just 
>>>>> this
>>>> patch addressed the issue:
>>>>> [   82.703503] amdgpu 0000:0c:00.0: GPU fault detected: 146 
>>>>> 0x0000480c for
>>>> process  pid 0 thread  pid 0
>>>>> [   82.703512] amdgpu 0000:0c:00.0:
>>>> VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00001000
>>>>> [   82.703516] amdgpu 0000:0c:00.0:
>>>> VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x1004800C
>>>>> [   82.703522] amdgpu 0000:0c:00.0: VM fault (0x0c, vmid 8, pasid 
>>>>> 32769) at
>>>> page 4096, read from 'TC0' (0x54433000) (72)
>>>>> [   82.703585] Evicting PASID 32769 queues
>>>>>
>>>>> I am looking into it, but if you have any insight that would be 
>>>>> great in
>>>> helping to resolve it quickly.
>>>>>    Kent
>>>>>> -----Original Message-----
>>>>>> From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of
>>>>>> Christian König
>>>>>> Sent: Tuesday, February 26, 2019 7:47 AM
>>>>>> To: amd-gfx@xxxxxxxxxxxxxxxxxxxxx
>>>>>> Subject: [PATCH 3/6] drm/amdgpu: allocate VM PDs/PTs on demand
>>>>>>
>>>>>> Let's start to allocate VM PDs/PTs on demand instead of
>>>>>> pre-allocating them during mapping.
>>>>>>
>>>>>> Signed-off-by: Christian König <christian.koenig@xxxxxxx>
>>>>>> Reviewed-by: Felix Kuehling <Felix.Kuehling@xxxxxxx>
>>>>>> ---
>>>>>>    .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  10 +-
>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c       |   9 --
>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c       |  10 --
>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        | 136 
>>>>>> +++++------------
>>>> -
>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |   3 -
>>>>>>    5 files changed, 39 insertions(+), 129 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>>>>> index 31e3953dcb6e..088e9b6b765b 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>>>>> @@ -410,15 +410,7 @@ static int add_bo_to_vm(struct amdgpu_device
>>>>>> *adev, struct kgd_mem *mem,
>>>>>>        if (p_bo_va_entry)
>>>>>>            *p_bo_va_entry = bo_va_entry;
>>>>>>
>>>>>> -    /* Allocate new page tables if needed and validate
>>>>>> -     * them.
>>>>>> -     */
>>>>>> -    ret = amdgpu_vm_alloc_pts(adev, vm, va, amdgpu_bo_size(bo));
>>>>>> -    if (ret) {
>>>>>> -        pr_err("Failed to allocate pts, err=%d\n", ret);
>>>>>> -        goto err_alloc_pts;
>>>>>> -    }
>>>>>> -
>>>>>> +    /* Allocate validate page tables if needed */
>>>>>>        ret = vm_validate_pt_pd_bos(vm);
>>>>>>        if (ret) {
>>>>>>            pr_err("validate_pt_pd_bos() failed\n"); diff --git
>>>>>> a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
>>>>>> index 7e22be7ca68a..54dd02a898b9 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
>>>>>> @@ -92,15 +92,6 @@ int amdgpu_map_static_csa(struct amdgpu_device
>>>>>> *adev, struct amdgpu_vm *vm,
>>>>>>            return -ENOMEM;
>>>>>>        }
>>>>>>
>>>>>> -    r = amdgpu_vm_alloc_pts(adev, (*bo_va)->base.vm, csa_addr,
>>>>>> -                size);
>>>>>> -    if (r) {
>>>>>> -        DRM_ERROR("failed to allocate pts for static CSA, 
>>>>>> err=%d\n",
>>>>>> r);
>>>>>> -        amdgpu_vm_bo_rmv(adev, *bo_va);
>>>>>> -        ttm_eu_backoff_reservation(&ticket, &list);
>>>>>> -        return r;
>>>>>> -    }
>>>>>> -
>>>>>>        r = amdgpu_vm_bo_map(adev, *bo_va, csa_addr, 0, size,
>>>>>>                     AMDGPU_PTE_READABLE |
>>>>>> AMDGPU_PTE_WRITEABLE |
>>>>>>                     AMDGPU_PTE_EXECUTABLE);
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>>>>> index 555285e329ed..fcaaac30e84b 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>>>>> @@ -625,11 +625,6 @@ int amdgpu_gem_va_ioctl(struct drm_device
>>>> *dev,
>>>>>> void *data,
>>>>>>
>>>>>>        switch (args->operation) {
>>>>>>        case AMDGPU_VA_OP_MAP:
>>>>>> -        r = amdgpu_vm_alloc_pts(adev, bo_va->base.vm, args-
>>>>>>> va_address,
>>>>>> -                    args->map_size);
>>>>>> -        if (r)
>>>>>> -            goto error_backoff;
>>>>>> -
>>>>>>            va_flags = amdgpu_gmc_get_pte_flags(adev, args->flags);
>>>>>>            r = amdgpu_vm_bo_map(adev, bo_va, args->va_address,
>>>>>>                         args->offset_in_bo, args->map_size, @@ -
>>>>>> 645,11 +640,6 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev,
>>>> void
>>>>>> *data,
>>>>>>                            args->map_size);
>>>>>>            break;
>>>>>>        case AMDGPU_VA_OP_REPLACE:
>>>>>> -        r = amdgpu_vm_alloc_pts(adev, bo_va->base.vm, args-
>>>>>>> va_address,
>>>>>> -                    args->map_size);
>>>>>> -        if (r)
>>>>>> -            goto error_backoff;
>>>>>> -
>>>>>>            va_flags = amdgpu_gmc_get_pte_flags(adev, args->flags);
>>>>>>            r = amdgpu_vm_bo_replace_map(adev, bo_va, args-
>>>>>>> va_address,
>>>>>>                             args->offset_in_bo, args-
>>>>>>> map_size, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>> index 362436f4e856..dfad543fc000 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>> @@ -504,47 +504,6 @@ static void amdgpu_vm_pt_next(struct
>>>>>> amdgpu_device *adev,
>>>>>>        }
>>>>>>    }
>>>>>>
>>>>>> -/**
>>>>>> - * amdgpu_vm_pt_first_leaf - get first leaf PD/PT
>>>>>> - *
>>>>>> - * @adev: amdgpu_device pointer
>>>>>> - * @vm: amdgpu_vm structure
>>>>>> - * @start: start addr of the walk
>>>>>> - * @cursor: state to initialize
>>>>>> - *
>>>>>> - * Start a walk and go directly to the leaf node.
>>>>>> - */
>>>>>> -static void amdgpu_vm_pt_first_leaf(struct amdgpu_device *adev,
>>>>>> -                    struct amdgpu_vm *vm, uint64_t start,
>>>>>> -                    struct amdgpu_vm_pt_cursor *cursor)
>>>>>> -{
>>>>>> -    amdgpu_vm_pt_start(adev, vm, start, cursor);
>>>>>> -    while (amdgpu_vm_pt_descendant(adev, cursor));
>>>>>> -}
>>>>>> -
>>>>>> -/**
>>>>>> - * amdgpu_vm_pt_next_leaf - get next leaf PD/PT
>>>>>> - *
>>>>>> - * @adev: amdgpu_device pointer
>>>>>> - * @cursor: current state
>>>>>> - *
>>>>>> - * Walk the PD/PT tree to the next leaf node.
>>>>>> - */
>>>>>> -static void amdgpu_vm_pt_next_leaf(struct amdgpu_device *adev,
>>>>>> -                   struct amdgpu_vm_pt_cursor *cursor)
>>>>>> -{
>>>>>> -    amdgpu_vm_pt_next(adev, cursor);
>>>>>> -    if (cursor->pfn != ~0ll)
>>>>>> -        while (amdgpu_vm_pt_descendant(adev, cursor));
>>>>>> -}
>>>>>> -
>>>>>> -/**
>>>>>> - * for_each_amdgpu_vm_pt_leaf - walk over all leaf PDs/PTs in the
>>>>>> hierarchy
>>>>>> - */
>>>>>> -#define for_each_amdgpu_vm_pt_leaf(adev, vm, start, end, cursor)
>>>>>>     \
>>>>>> -    for (amdgpu_vm_pt_first_leaf((adev), (vm), (start), &(cursor));
>>>>>>         \
>>>>>> -         (cursor).pfn <= end; amdgpu_vm_pt_next_leaf((adev),
>>>>>> &(cursor)))
>>>>>> -
>>>>>>    /**
>>>>>>     * amdgpu_vm_pt_first_dfs - start a deep first search
>>>>>>     *
>>>>>> @@ -915,74 +874,51 @@ static void amdgpu_vm_bo_param(struct
>>>>>> amdgpu_device *adev, struct amdgpu_vm *vm,
>>>>>>     * Returns:
>>>>>>     * 0 on success, errno otherwise.
>>>>>>     */
>>>>>> -int amdgpu_vm_alloc_pts(struct amdgpu_device *adev,
>>>>>> -            struct amdgpu_vm *vm,
>>>>>> -            uint64_t saddr, uint64_t size)
>>>>>> +static int amdgpu_vm_alloc_pts(struct amdgpu_device *adev,
>>>>>> +                   struct amdgpu_vm *vm,
>>>>>> +                   struct amdgpu_vm_pt_cursor *cursor)
>>>>>>    {
>>>>>> -    struct amdgpu_vm_pt_cursor cursor;
>>>>>> +    struct amdgpu_vm_pt *entry = cursor->entry;
>>>>>> +    struct amdgpu_bo_param bp;
>>>>>>        struct amdgpu_bo *pt;
>>>>>> -    uint64_t eaddr;
>>>>>>        int r;
>>>>>>
>>>>>> -    /* validate the parameters */
>>>>>> -    if (saddr & AMDGPU_GPU_PAGE_MASK || size &
>>>>>> AMDGPU_GPU_PAGE_MASK)
>>>>>> -        return -EINVAL;
>>>>>> +    if (cursor->level < AMDGPU_VM_PTB && !entry->entries) {
>>>>>> +        unsigned num_entries;
>>>>>>
>>>>>> -    eaddr = saddr + size - 1;
>>>>>> -
>>>>>> -    saddr /= AMDGPU_GPU_PAGE_SIZE;
>>>>>> -    eaddr /= AMDGPU_GPU_PAGE_SIZE;
>>>>>> -
>>>>>> -    if (eaddr >= adev->vm_manager.max_pfn) {
>>>>>> -        dev_err(adev->dev, "va above limit (0x%08llX >=
>>>>>> 0x%08llX)\n",
>>>>>> -            eaddr, adev->vm_manager.max_pfn);
>>>>>> -        return -EINVAL;
>>>>>> +        num_entries = amdgpu_vm_num_entries(adev, cursor-
>>>>>>> level);
>>>>>> +        entry->entries = kvmalloc_array(num_entries,
>>>>>> +                        sizeof(*entry->entries),
>>>>>> +                        GFP_KERNEL | __GFP_ZERO);
>>>>>> +        if (!entry->entries)
>>>>>> +            return -ENOMEM;
>>>>>>        }
>>>>>>
>>>>>> -    for_each_amdgpu_vm_pt_leaf(adev, vm, saddr, eaddr, cursor) {
>>>>>> -        struct amdgpu_vm_pt *entry = cursor.entry;
>>>>>> -        struct amdgpu_bo_param bp;
>>>>>> -
>>>>>> -        if (cursor.level < AMDGPU_VM_PTB) {
>>>>>> -            unsigned num_entries;
>>>>>> -
>>>>>> -            num_entries = amdgpu_vm_num_entries(adev,
>>>>>> cursor.level);
>>>>>> -            entry->entries = kvmalloc_array(num_entries,
>>>>>> -                            sizeof(*entry-
>>>>>>> entries),
>>>>>> -                            GFP_KERNEL |
>>>>>> -                            __GFP_ZERO);
>>>>>> -            if (!entry->entries)
>>>>>> -                return -ENOMEM;
>>>>>> -        }
>>>>>> -
>>>>>> -
>>>>>> -        if (entry->base.bo)
>>>>>> -            continue;
>>>>>> -
>>>>>> -        amdgpu_vm_bo_param(adev, vm, cursor.level, &bp);
>>>>>> -
>>>>>> -        r = amdgpu_bo_create(adev, &bp, &pt);
>>>>>> -        if (r)
>>>>>> -            return r;
>>>>>> -
>>>>>> -        if (vm->use_cpu_for_update) {
>>>>>> -            r = amdgpu_bo_kmap(pt, NULL);
>>>>>> -            if (r)
>>>>>> -                goto error_free_pt;
>>>>>> -        }
>>>>>> +    if (entry->base.bo)
>>>>>> +        return 0;
>>>>>>
>>>>>> -        /* Keep a reference to the root directory to avoid
>>>>>> -        * freeing them up in the wrong order.
>>>>>> -        */
>>>>>> -        pt->parent = amdgpu_bo_ref(cursor.parent->base.bo);
>>>>>> +    amdgpu_vm_bo_param(adev, vm, cursor->level, &bp);
>>>>>>
>>>>>> -        amdgpu_vm_bo_base_init(&entry->base, vm, pt);
>>>>>> +    r = amdgpu_bo_create(adev, &bp, &pt);
>>>>>> +    if (r)
>>>>>> +        return r;
>>>>>>
>>>>>> -        r = amdgpu_vm_clear_bo(adev, vm, pt);
>>>>>> +    if (vm->use_cpu_for_update) {
>>>>>> +        r = amdgpu_bo_kmap(pt, NULL);
>>>>>>            if (r)
>>>>>>                goto error_free_pt;
>>>>>>        }
>>>>>>
>>>>>> +    /* Keep a reference to the root directory to avoid
>>>>>> +     * freeing them up in the wrong order.
>>>>>> +     */
>>>>>> +    pt->parent = amdgpu_bo_ref(cursor->parent->base.bo);
>>>>>> +    amdgpu_vm_bo_base_init(&entry->base, vm, pt);
>>>>>> +
>>>>>> +    r = amdgpu_vm_clear_bo(adev, vm, pt);
>>>>>> +    if (r)
>>>>>> +        goto error_free_pt;
>>>>>> +
>>>>>>        return 0;
>>>>>>
>>>>>>    error_free_pt:
>>>>>> @@ -1627,6 +1563,7 @@ static int amdgpu_vm_update_ptes(struct
>>>>>> amdgpu_pte_update_params *params,
>>>>>>        struct amdgpu_vm_pt_cursor cursor;
>>>>>>        uint64_t frag_start = start, frag_end;
>>>>>>        unsigned int frag;
>>>>>> +    int r;
>>>>>>
>>>>>>        /* figure out the initial fragment */
>>>>>>        amdgpu_vm_fragment(params, frag_start, end, flags, &frag,
>>>>>> &frag_end); @@ -1634,12 +1571,15 @@ static int
>>>>>> amdgpu_vm_update_ptes(struct amdgpu_pte_update_params *params,
>>>>>>        /* walk over the address space and update the PTs */
>>>>>>        amdgpu_vm_pt_start(adev, params->vm, start, &cursor);
>>>>>>        while (cursor.pfn < end) {
>>>>>> -        struct amdgpu_bo *pt = cursor.entry->base.bo;
>>>>>>            unsigned shift, parent_shift, mask;
>>>>>>            uint64_t incr, entry_end, pe_start;
>>>>>> +        struct amdgpu_bo *pt;
>>>>>>
>>>>>> -        if (!pt)
>>>>>> -            return -ENOENT;
>>>>>> +        r = amdgpu_vm_alloc_pts(params->adev, params->vm,
>>>>>> &cursor);
>>>>>> +        if (r)
>>>>>> +            return r;
>>>>>> +
>>>>>> +        pt = cursor.entry->base.bo;
>>>>>>
>>>>>>            /* The root level can't be a huge page */
>>>>>>            if (cursor.level == adev->vm_manager.root_level) { 
>>>>>> diff --git
>>>>>> a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>>> index 81ff8177f092..116605c038d2 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>>> @@ -303,9 +303,6 @@ bool amdgpu_vm_ready(struct amdgpu_vm *vm);
>>>> int
>>>>>> amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, struct
>>>>>> amdgpu_vm *vm,
>>>>>>                      int (*callback)(void *p, struct amdgpu_bo *bo),
>>>>>>                      void *param);
>>>>>> -int amdgpu_vm_alloc_pts(struct amdgpu_device *adev,
>>>>>> -            struct amdgpu_vm *vm,
>>>>>> -            uint64_t saddr, uint64_t size);
>>>>>>    int amdgpu_vm_flush(struct amdgpu_ring *ring, struct amdgpu_job
>>>>>> *job, bool need_pipe_sync);  int amdgpu_vm_update_directories(struct
>>>>>> amdgpu_device *adev,
>>>>>>                     struct amdgpu_vm *vm);
>>>>>> -- 
>>>>>> 2.17.1
>>>>>>
>>>>>> _______________________________________________
>>>>>> amd-gfx mailing list
>>>>>> amd-gfx@xxxxxxxxxxxxxxxxxxxxx
>>>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>>> _______________________________________________
>>>> amd-gfx mailing list
>>>> amd-gfx@xxxxxxxxxxxxxxxxxxxxx
>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx@xxxxxxxxxxxxxxxxxxxxx
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>
_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx