I merged the last two patch series and made a number of updates as
discussed. I also rebased it on Alex's drm-next-4.14-wip branch. All
patches applied cleanly without having to resolve any conflicts.

I'm also publishing my branch with this patch series on GitHub to make
it easier to test and apply the changes. This KFD branch requires the
corresponding Thunk branch from GitHub, which uses a different ioctl ABI
from the last ROCm release. The rest of the user mode stack should be OK
on top of that Thunk.

https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver/tree/fxkamd/drm-next-wip
https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/tree/fxkamd/drm-next-wip

On CZ, hsaconformance passes. OpenCL tests can be run with the current
ROCm OpenCL stack. I ran the SHOC benchmark as an example.

On KV, the current ROCm OpenCL driver doesn't work. This is a limitation
of the OpenCL driver. I'm trying to find out how hard it would be to
change that. In the meantime, some hsaconformance tests can be run on
KV. It passes most tests up to 162:code_recursive_kernel_function, where
it hangs.

For testing KV with the current user mode stack, please use amdgpu. I
don't expect this to work with radeon and I'm not planning to spend any
effort on making radeon work with a current user mode stack.

Dropped patches:
 * drm/amdkfd: Fix double Mutex lock order

Added patches:
 * drm/radeon: Return dword offsets of address watch registers
 * drm/amdgpu: Program SH_STATIC_MEM_CONFIG globally, not per-VMID

Reordered patch sequence:
I moved "drm/amdgpu: Remove hard-coded assumptions about compute pipes"
just after "drm/amdkfd: Fix allocated_queues bitmap initialization"
because these two patches fix related problems (how queues are shared
between KFD and KGD).

Updated patches:
 * drm/amdkfd: Clean up KFD style errors and warnings v2
 * drm/amdkfd: Fix goto usage v2
 * drm/amdkfd: Handle remaining BUG_ONs more gracefully v2
 * drm/amdkfd: Add more error printing to help bringup v2
 * drm/amdgpu: Add kgd/kfd interface to support scratch memory v2
 * drm/amdkfd: Adding new IOCTL for scratch memory v2
 * drm/amdgpu: Add kgd kfd interface get_tile_config() v2
 * drm/amdkfd: Implement image tiling mode support v2

Patches that still need a Reviewed-by or Acked-by:
 * drm/radeon: Return dword offsets of address watch registers
 * drm/amdkfd: Remove bogus divide-by-sizeof(uint32_t)
 * drm/amdkfd: Clean up KFD style errors and warnings v2
 * drm/amdkfd: Handle remaining BUG_ONs more gracefully v2
 * drm/amdkfd: Add more error printing to help bringup v2
 * drm/amdgpu: Program SH_STATIC_MEM_CONFIG globally, not per-VMID
 * drm/amdgpu: Add kgd/kfd interface to support scratch memory v2
 * drm/amdkfd: Adding new IOCTL for scratch memory v2

***

Felix Kuehling (13):
  drm/amdkfd: Fix typo in dbgdev_wave_reset_wavefronts
  drm/radeon: Return dword offsets of address watch registers
  drm/amdkfd: Remove bogus divide-by-sizeof(uint32_t)
  drm/amdkfd: Fix allocated_queues bitmap initialization
  drm/amdgpu: Remove hard-coded assumptions about compute pipes
  drm/amdkfd: Remove BUG_ONs for NULL pointer arguments
  drm/amdkfd: Fix doorbell initialization and finalization
  drm/amdkfd: Allocate gtt_sa_bitmap in long units
  drm/amdkfd: Handle remaining BUG_ONs more gracefully v2
  drm/amdkfd: Update PM4 packet headers
  drm/amdgpu: Disable GFX PG on CZ
  drm/amd: Update MEC HQD loading code for KFD
  drm/amdgpu: Program SH_STATIC_MEM_CONFIG globally, not per-VMID

Jay Cornwall (1):
  drm/amdkfd: Clamp EOP queue size correctly on Gfx8

Kent Russell (5):
  drm/amdkfd: Clean up KFD style errors and warnings v2
  drm/amdkfd: Consolidate and clean up log commands
  drm/amdkfd: Change x==NULL/false references to !x
  drm/amdkfd: Fix goto usage v2
  drm/amdkfd: Remove usage of alloc(sizeof(struct...

Moses Reuben (2):
  drm/amdgpu: Add kgd/kfd interface to support scratch memory v2
  drm/amdkfd: Adding new IOCTL for scratch memory v2

Yong Zhao (3):
  drm/amdkfd: Add more error printing to help bringup v2
  drm/amdgpu: Add kgd kfd interface get_tile_config() v2
  drm/amdkfd: Implement image tiling mode support v2

 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c         |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h         |  16 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c  | 189 ++++++++++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c  | 227 ++++++++++++--
 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c              |   2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c              |   3 +-
 drivers/gpu/drm/amd/amdgpu/vi.c                    |   3 +-
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c           | 183 ++++++++----
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c            | 102 +++----
 drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c            |  21 +-
 drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.h            |  27 +-
 drivers/gpu/drm/amd/amdkfd/kfd_device.c            | 123 ++++----
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 316 ++++++++------------
 .../drm/amd/amdkfd/kfd_device_queue_manager_cik.c  |   8 +-
 .../drm/amd/amdkfd/kfd_device_queue_manager_vi.c   |   8 +-
 drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c          |  40 +--
 drivers/gpu/drm/amd/amdkfd/kfd_events.c            |  33 +--
 drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c       |   2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c         |   2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c      |  63 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_module.c            |  10 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h       |   3 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |  62 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c    |  46 +--
 drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c    | 294 ++++++------------
 drivers/gpu/drm/amd/amdkfd/kfd_pasid.c             |   7 +-
 drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h       | 330 +++------------------
 drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h    | 140 ++++++++-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h              |  32 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c           |  25 +-
 .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  71 ++---
 drivers/gpu/drm/amd/amdkfd/kfd_queue.c             |  12 +-
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c          |  46 +--
 drivers/gpu/drm/amd/include/kgd_kfd_interface.h    |  30 +-
 drivers/gpu/drm/radeon/radeon_kfd.c                |  15 +-
 include/uapi/linux/kfd_ioctl.h                     |  37 ++-
 36 files changed, 1280 insertions(+), 1252 deletions(-)

-- 
2.7.4