Support for amdgpu VM update via CPU on large-bar systems

deathsimple@xxxxxxxxxxx (Christian König) · Wed, 10 May 2017 09:23:59 +0200

Hi Harish,

for the next time please use git send-email to send patches to the 
mailing list.

Looks pretty good in general, but a few notes all over the place.

Patch #1:
> +static bool amdgpu_vm_is_large_bar(struct amdgpu_device *adev)
> +{
> +    if (adev->mc.visible_vram_size > 0 &&
> +            adev->mc.real_vram_size == adev->mc.visible_vram_size)
> +        return true;
> +    return false;
> +}
The visible_vram_size > 0 check looks superfluous. The coding style 
looks off, the second line of the "if" is to far right.

And in general that should rather look like "return 
adev->mc.real_vram_size == adev->mc.visible_vram_size;".

> +    /* CPU update is only supported for large BAR system */
> +    vm->is_vm_update_mode_cpu = (vm_update_mode &&
> +                     amdgpu_vm_is_large_bar(adev));
Mhm, it would be nice if we could enable this for testing even on 
systems with small BAR.

I would rather suggest to make the default value depend on if the BOX 
has a large or small BAR and let the user override the setting.

Additional to that we should limit this to 64bit systems.

> +#define AMDGPU_VM_USE_CPU_UPDATE_FOR_GRAPHICS_MASK (1 << 0)
> +#define AMDGPU_VM_USE_CPU_UPDATE_FOR_COMPUTE_MASK (1 << 1)
Not much of an issue, but the names are a bit long. Maybe use something 
like AMDGPU_VM_USE_CPU_FOR_GFX and AMDGPU_VM_USE_CPU_FOR_COMPUTE.

> +    bool is_vm_update_mode_cpu : 1;
Please no bitmasks when we don't need them.

Patch #2:
> +    u64 flags;
>      unsigned shift = (adev->vm_manager.num_level - level) *
>          adev->vm_manager.block_size;
Reverse tree order.

> +    flags = (vm->is_vm_update_mode_cpu) ?
> +            AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED :
> +            AMDGPU_GEM_CREATE_NO_CPU_ACCESS |
> +            AMDGPU_GEM_CREATE_SHADOW;
> +
> +
...
> - AMDGPU_GEM_CREATE_NO_CPU_ACCESS |
> -                         AMDGPU_GEM_CREATE_SHADOW |
> +                         flags |
>                           AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS |
>                           AMDGPU_GEM_CREATE_VRAM_CLEARED,

I would rather write it like this:

flags = AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS | AMDGPU_GEM_CREATE_VRAM_CLEARED;
if (vm->is_vm_update_mode_cpu)
     flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
else
     flags |= AMDGPU_GEM_CREATE_NO_CPU_ACCESS | AMDGPU_GEM_CREATE_SHADOW;

> +    mb();
> +    amdgpu_gart_flush_gpu_tlb(params->adev, 0);
You do this for the HDP flush, don't you?

> +            dev_warn(adev->dev, "Page table update using CPU failed. 
> Fallback to SDMA\n");
NACK, if kmapping the BO fails we most likely are either out of memory 
or out of address space.

Falling back to the SDMA doesn't make the situation better, just return 
a proper error code here.

> +            /* Wait for BO to be free. SDMA could be clearing it */
> +            amdgpu_sync_create(&sync);
> +            amdgpu_sync_resv(adev, &sync, parent->bo->tbo.resv,
> +                     AMDGPU_FENCE_OWNER_VM);
> +            amdgpu_sync_wait(&sync);
> +            amdgpu_sync_free(&sync);
Superfluous and a bit incorrect, amdgpu_bo_kmap already calls 
reservation_object_wait_timeout_rcu() but only for the exclusive fence.

You probably ran into problems because of pipelined evictions, so that 
should be fixed in amdgpu_bo_kmap() instead.

> -  amdgpu_vm_do_set_ptes(&params,
> -                                  last_shadow,
> -                                  pt_addr, count,
> -                                  incr,
> -                                  AMDGPU_PTE_VALID);
> -
> -                amdgpu_vm_do_set_ptes(&params, last_pde,
> -                              pt_addr, count, incr,
> -                              AMDGPU_PTE_VALID);
> +                    params.func(&params,
> +                            last_shadow,
> +                            pt_addr, count,
> +                            incr,
> +                            AMDGPU_PTE_VALID);
> +
> +                params.func(&params, last_pde,
> +                        pt_addr, count, incr,
> +                        AMDGPU_PTE_VALID);
Might be a good idea to separate that into a cleanup patch.

Patch #3: Looks good to me and is Reviewed-by: Christian KÃ¶nig 
<christian.koenig at amd.com>.

Please move that one to the beginning of the series and/or commit it 
directly to amd-staging-4.9.

Patch #4:
> +    /* The next three are used during VM update by CPU */
> +    bool update_by_cpu;
We can distinguish that as well by looking at the function pointer, 
don't we?

> +        dev_warn(adev->dev,
> +             "CPU update of VM failed. Fallback to SDMA\n");
> +
> +        /* Reset params for SDMA fallback path */
> +        params.update_by_cpu = false;
> +        params.pages_addr = NULL;
> +        params.kptr = NULL;
> +        params.func = NULL;
Again completely drop the SDMA fallback path.

Regards,
Christian.

Am 09.05.2017 um 20:34 schrieb Kasiviswanathan, Harish:
> Hi,
>
> Please review the patch set that supports amdgpu VM update via CPU. This feature provides improved performance for compute (HSA) where mapping / unmapping is carried out (by Kernel) independent of command submissions (done directly by user space). This version doesn't support shadow copy of VM page tables for CPU based update.
>
> Best Regards,
> Harish
>
>
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20170510/c9a7eff3/attachment.html>