The reverted commit causes the screen to freeze a few moments after
running clinfo on v6.6-rc7 with ROCm 5.6. Sometimes the rest of the
computer, including ssh, also freezes. On v6.5-rc1, it only results in
a NULL pointer dereference message in dmesg and the process becoming an
unkillable zombie, which prevents shutdown without REISUB.

llama.cpp and hashcat worked on v6.2 with ROCm 5.6, have since broken,
and are not fixed by this revert. However, pytorch-rocm now runs stably,
and accidentally running clinfo no longer freezes the whole computer.

This reverts commit 1d7776cc148b9f2f3ebaf1181662ba695a29f639.

Closes: https://github.com/RadeonOpenCompute/ROCm/issues/2596
Signed-off-by: Daniel Tang <danielzgtg.opensource@xxxxxxxxx>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 82f25996ff5e..602f311ab766 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2243,16 +2243,16 @@ int amdgpu_vm_make_compute(struct amdgpu_device *adev, struct amdgpu_vm *vm)
 	if (r)
 		return r;
 
+	/* Sanity checks */
+	if (!amdgpu_vm_pt_is_root_clean(adev, vm)) {
+		r = -EINVAL;
+		goto unreserve_bo;
+	}
+
 	/* Check if PD needs to be reinitialized and do it before
 	 * changing any other state, in case it fails.
 	 */
 	if (pte_support_ats != vm->pte_support_ats) {
-		/* Sanity checks */
-		if (!amdgpu_vm_pt_is_root_clean(adev, vm)) {
-			r = -EINVAL;
-			goto unreserve_bo;
-		}
-
 		vm->pte_support_ats = pte_support_ats;
 		r = amdgpu_vm_pt_clear(adev, vm, to_amdgpu_bo_vm(vm->root.bo),
 				       false);
-- 
2.40.1