On Mon, Mar 19, 2018 at 9:05 PM, Felix Kuehling <felix.kuehling at amd.com> wrote: > On 2018-03-19 12:39 PM, Christian König wrote: >> So coming back to this series once more. >> >> Patch #1, #3 are Reviewed-by: Christian König <christian.koenig at amd.com>. >> >> Patch #2, #4 - #13 and #18-#19 are Acked-by: Christian König >> <christian.koenig at amd.com>. >> >> Patch #14: What's the difference to setting vramlimit=$size_of_bar ? > > The debug_largebar option only affects KFD. Graphics can still use all > memory. > >> >> Patch #15 & #20: Why is that actually still needed? I thought we have >> fixed all dependencies and can now use the "standard" way of attaching >> fences to reservation objects to do this. > > Patch 15 adds a KFD-specific MMU notifier, because the graphics MMU > notifier deals with amdgpu_cs command submission. We need a completely > different treatment of MMU notifiers for KFD. We need to stop user mode > queues. > > Patch 20 implements the user mode queue preemption mechanism, and the > corresponding restore function that re-validates userptr BOs before > restarting user mode queues. > > I think you're implying that the graphics MMU notifier would wait for > the eviction fence and trigger a regular eviction in KFD. I haven't > tried that. The MMU notifier and userptr eviction mechanism was > implemented in KFD before we had the TTM evictions. We didn't go back > and revisit it after that. There are a few major differences that make > me want to keep the two types of evictions separate, though: > > * TTM evictions are the result of memory pressure (triggered by > another process) > o Most MMU notifiers are triggered by something in the same > process (fork, mprotect, etc.) > o Thus the restore delay after MMU notifiers can be much shorter > * TTM evictions are done asynchronously in a delayed worker > o MMU notifiers are synchronous, queues need to be stopped before > returning > * KFD's TTM eviction/restore code doesn't handle userptrs > (get_user_pages, etc) > o MMU notifier restore worker is specialized to handle just userptrs > > >> >> Saying so I still need to take a closer look at patch #20. >> >> Patch #16: Looks good to me in general, but I think it would be safer >> if we grab a reference to the task structure. Otherwise grabbing pages >> from a mm_struct sounds a bit scary to me. > > You're right. I've never seen it cause problems when testing process > termination, probably because during process termination KFD cancels the > delayed worker that calls this function. But I'll fix this and take a > proper reference. > >> >> Patch #17: I think it would be better to allocate the node when the >> locks are not held and free it when we find that it isn't used, but no >> big deal. > > OK. I'll change that. > > Thanks, > Felix > >> >> Regards, >> Christian. >> >> Am 15.03.2018 um 22:27 schrieb Felix Kuehling: >>> Rebased and integrated review feedback from v3: >>> * Removed vm->vm_context field >>> * Use uninterruptible waiting in initial PD validation to avoid >>> ERESTARTSYS >>> * Return number of successful map/unmap operations in failure cases >>> * Facilitate partial retry after failed map/unmap >>> * Added comments with parameter descriptions to new APIs >>> * Defined AMDKFD_IOC_FREE_MEMORY_OF_GPU write-only >>> >>> This patch series also adds Userptr support in patches 15-20. >>> >>> Felix Kuehling (19): >>> drm/amdgpu: Move KFD-specific fields into struct amdgpu_vm >>> drm/amdgpu: Fix initial validation of PD BO for KFD VMs >>> drm/amdgpu: Add helper to turn an existing VM into a compute VM >>> drm/amdgpu: Add kfd2kgd interface to acquire an existing VM >>> drm/amdkfd: Create KFD VMs on demand >>> drm/amdkfd: Remove limit on number of GPUs >>> drm/amdkfd: Aperture setup for dGPUs >>> drm/amdkfd: Add per-process IDR for buffer handles >>> drm/amdkfd: Allocate CWSR trap handler memory for dGPUs >>> drm/amdkfd: Add TC flush on VMID deallocation for Hawaii >>> drm/amdkfd: Add ioctls for GPUVM memory management >>> drm/amdkfd: Kmap event page for dGPUs >>> drm/amdkfd: Add module option for testing large-BAR functionality >>> drm/amdgpu: Add MMU notifier type for KFD userptr >>> drm/amdgpu: Enable amdgpu_ttm_tt_get_user_pages in worker threads >>> drm/amdgpu: GFP_NOIO while holding locks taken in MMU notifier >>> drm/amdkfd: GFP_NOIO while holding locks taken in MMU notifier >>> drm/amdkfd: Add quiesce_mm and resume_mm to kgd2kfd_calls >>> drm/amdgpu: Add userptr support for KFD >>> >>> Oak Zeng (1): >>> drm/amdkfd: Populate DRM render device minor >>> >>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 37 +- >>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c | 1 + >>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c | 1 + >>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 818 >>> ++++++++++++++++++--- >>> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +- >>> drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c | 96 ++- >>> drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h | 11 +- >>> drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 30 +- >>> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 70 ++ >>> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 10 + >>> drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 532 >>> ++++++++++++++ >>> drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 3 + >>> drivers/gpu/drm/amd/amdkfd/kfd_device.c | 40 +- >>> .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 22 +- >>> drivers/gpu/drm/amd/amdkfd/kfd_events.c | 31 +- >>> drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c | 59 +- >>> drivers/gpu/drm/amd/amdkfd/kfd_module.c | 7 + >>> drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c | 2 +- >>> drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c | 2 +- >>> drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c | 37 + >>> drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 41 ++ >>> drivers/gpu/drm/amd/amdkfd/kfd_process.c | 314 +++++++- >>> drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 4 + >>> drivers/gpu/drm/amd/amdkfd/kfd_topology.h | 1 + >>> drivers/gpu/drm/amd/include/kgd_kfd_interface.h | 10 + >>> include/uapi/linux/kfd_ioctl.h | 122 ++- >>> 26 files changed, 2090 insertions(+), 213 deletions(-) >>> >> > Hi Felix, I did a quick pass on the patch-set and didn't see anything scary. Patches 1-14 are already applied to my -next tree. If I will send it now to Dave I believe we would be ok from schedule POV. I suggest we delay userptr support for the next kernel release because you need to address what Christian said and I also want to take a closer look at it. What do you think ? Thanks, Oded