Hi Harish, for the next time please use git send-email to send patches to the mailing list. Looks pretty good in general, but a few notes all over the place. Patch #1: > +static bool amdgpu_vm_is_large_bar(struct amdgpu_device *adev) > +{ > + if (adev->mc.visible_vram_size > 0 && > + adev->mc.real_vram_size == adev->mc.visible_vram_size) > + return true; > + return false; > +} The visible_vram_size > 0 check looks superfluous. The coding style looks off, the second line of the "if" is to far right. And in general that should rather look like "return adev->mc.real_vram_size == adev->mc.visible_vram_size;". > + /* CPU update is only supported for large BAR system */ > + vm->is_vm_update_mode_cpu = (vm_update_mode && > + amdgpu_vm_is_large_bar(adev)); Mhm, it would be nice if we could enable this for testing even on systems with small BAR. I would rather suggest to make the default value depend on if the BOX has a large or small BAR and let the user override the setting. Additional to that we should limit this to 64bit systems. > +#define AMDGPU_VM_USE_CPU_UPDATE_FOR_GRAPHICS_MASK (1 << 0) > +#define AMDGPU_VM_USE_CPU_UPDATE_FOR_COMPUTE_MASK (1 << 1) Not much of an issue, but the names are a bit long. Maybe use something like AMDGPU_VM_USE_CPU_FOR_GFX and AMDGPU_VM_USE_CPU_FOR_COMPUTE. > + bool is_vm_update_mode_cpu : 1; Please no bitmasks when we don't need them. Patch #2: > + u64 flags; > unsigned shift = (adev->vm_manager.num_level - level) * > adev->vm_manager.block_size; Reverse tree order. > + flags = (vm->is_vm_update_mode_cpu) ? > + AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED : > + AMDGPU_GEM_CREATE_NO_CPU_ACCESS | > + AMDGPU_GEM_CREATE_SHADOW; > + > + ... > - AMDGPU_GEM_CREATE_NO_CPU_ACCESS | > - AMDGPU_GEM_CREATE_SHADOW | > + flags | > AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS | > AMDGPU_GEM_CREATE_VRAM_CLEARED, I would rather write it like this: flags = AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS | AMDGPU_GEM_CREATE_VRAM_CLEARED; if (vm->is_vm_update_mode_cpu) flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED; else flags |= AMDGPU_GEM_CREATE_NO_CPU_ACCESS | AMDGPU_GEM_CREATE_SHADOW; > + mb(); > + amdgpu_gart_flush_gpu_tlb(params->adev, 0); You do this for the HDP flush, don't you? > + dev_warn(adev->dev, "Page table update using CPU failed. > Fallback to SDMA\n"); NACK, if kmapping the BO fails we most likely are either out of memory or out of address space. Falling back to the SDMA doesn't make the situation better, just return a proper error code here. > + /* Wait for BO to be free. SDMA could be clearing it */ > + amdgpu_sync_create(&sync); > + amdgpu_sync_resv(adev, &sync, parent->bo->tbo.resv, > + AMDGPU_FENCE_OWNER_VM); > + amdgpu_sync_wait(&sync); > + amdgpu_sync_free(&sync); Superfluous and a bit incorrect, amdgpu_bo_kmap already calls reservation_object_wait_timeout_rcu() but only for the exclusive fence. You probably ran into problems because of pipelined evictions, so that should be fixed in amdgpu_bo_kmap() instead. > - amdgpu_vm_do_set_ptes(¶ms, > - last_shadow, > - pt_addr, count, > - incr, > - AMDGPU_PTE_VALID); > - > - amdgpu_vm_do_set_ptes(¶ms, last_pde, > - pt_addr, count, incr, > - AMDGPU_PTE_VALID); > + params.func(¶ms, > + last_shadow, > + pt_addr, count, > + incr, > + AMDGPU_PTE_VALID); > + > + params.func(¶ms, last_pde, > + pt_addr, count, incr, > + AMDGPU_PTE_VALID); Might be a good idea to separate that into a cleanup patch. Patch #3: Looks good to me and is Reviewed-by: Christian König <christian.koenig at amd.com>. Please move that one to the beginning of the series and/or commit it directly to amd-staging-4.9. Patch #4: > + /* The next three are used during VM update by CPU */ > + bool update_by_cpu; We can distinguish that as well by looking at the function pointer, don't we? > + dev_warn(adev->dev, > + "CPU update of VM failed. Fallback to SDMA\n"); > + > + /* Reset params for SDMA fallback path */ > + params.update_by_cpu = false; > + params.pages_addr = NULL; > + params.kptr = NULL; > + params.func = NULL; Again completely drop the SDMA fallback path. Regards, Christian. Am 09.05.2017 um 20:34 schrieb Kasiviswanathan, Harish: > Hi, > > Please review the patch set that supports amdgpu VM update via CPU. This feature provides improved performance for compute (HSA) where mapping / unmapping is carried out (by Kernel) independent of command submissions (done directly by user space). This version doesn't support shadow copy of VM page tables for CPU based update. > > Best Regards, > Harish > > > > _______________________________________________ > amd-gfx mailing list > amd-gfx at lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx -------------- next part -------------- An HTML attachment was scrubbed... URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20170510/c9a7eff3/attachment.html>