This series fixes some KFD issues, adds robustness enhancements and
finally a few cleanups.

Patches 1-4 are important fixes.
Patches 5-8 add handling of GPU VM faults
Patches 9-22 add handling of GPU resets and detection of HWS hangs
Patches 23-25 are various cleanups

Felix Kuehling (2):
  drm/amdkfd: Reliably prevent reclaim-FS while holding DQM lock
  drm/amdkfd: Stop using GFP_NOIO explicitly

Jay Cornwall (1):
  drm/amdkfd: Fix race between scheduler and context restore

Lan Xiao (1):
  drm/amdkfd: fix zero reading of VMID and PASID for Hawaii

Moses Reuben (1):
  drm/amdkfd: When we get KFD_EVENT_TYPE_MEMORY we send the process
    SIGSEGV

Shaoyun Liu (13):
  drm/amd: Add gpu reset interfaces between amdgpu and amdkfd
  drm/amd: Add kfd ioctl defines for hw_exception event
  drm/amdkfd: Add gpu reset interface and place holder
  drm/amdgpu: Call KFD reset handlers during GPU reset
  drm/amdkfd: Implement GPU reset handlers in KFD
  drm/amdgpu: Enable the gpu reset from KFD
  drm/amdkfd: Implement hang detection in KFD and call amdgpu
  drm/amdgpu: Don't use shadow BO for compute context
  drm/amdgpu: Check NULL pointer for job before reset job's ring
  drm/amdkfd: Fix kernel queue 64 bit doorbell offset calculation
  drm/amdgpu: Avoid invalidate tlbs when gpu is on reset
  drm/amdgpu: Avoid destroy hqd when GPU is on reset
  drm/amdkfd: Add debugfs interface to trigger HWS hang

Wei Lu (1):
  drm/amdkfd: Fix error codes in kfd_get_process

Yong Zhao (4):
  drm/amdkfd: Introduce KFD module parameter halt_if_hws_hang
  drm/amdkfd: Use module parameters noretry as the internal variable
    name
  drm/amdkfd: Replace mqd with mqd_mgr as the variable name for
    mqd_manager
  drm/amdkfd: Clean up reference of radeon

shaoyunl (2):
  drm/amdgpu: get_vm_fault implementation on amdgpu side
  drm/amdkfd: Handle VM faults in KFD

 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c         |  27 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h         |   9 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c  |  26 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c  |   8 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c  |   7 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c   |  14 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c         |   7 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h            |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c             |  13 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c              |  33 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c              |  33 +-
 drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c   |  54 ++-
 drivers/gpu/drm/amd/amdkfd/cik_int.h               |   7 +-
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h     | 458 +++++++++++---------
 .../gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm  |  18 +-
 .../gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm  |  16 +-
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c           |   3 +
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c            |   1 -
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.h            |  37 ++
 drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c           |  48 +++
 drivers/gpu/drm/amd/amdkfd/kfd_device.c            |  94 ++++-
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 247 ++++++-----
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.h  |  26 +-
 .../drm/amd/amdkfd/kfd_device_queue_manager_v9.c   |   2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c          |   9 +-
 drivers/gpu/drm/amd/amdkfd/kfd_events.c            |  71 ++++
 drivers/gpu/drm/amd/amdkfd/kfd_events.h            |   1 +
 drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c    |  22 +-
 drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c         |   6 +-
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c      |  17 +-
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.h      |   2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_module.c            |  16 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |   2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c    |   4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c    |   2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c    |  26 ++
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h              |  34 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c           |   2 +
 .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  10 +-
 drivers/gpu/drm/amd/include/kgd_kfd_interface.h    |  35 ++
 include/uapi/linux/kfd_ioctl.h                     |  22 +-
 41 files changed, 1081 insertions(+), 390 deletions(-)

-- 
2.7.4