This patch series fixes random GPU no-retry faults on APUs with XNACK on.

When a GPU page table update switches to using PDE0 as a PTE, for example
when a 2MB-aligned virtual address is unmapped and the same virtual address
is then mapped with a transparent 2MB huge page, we free the PTE BO first
and flush the TLB afterwards. With XNACK on, the H/W may access the freed
old PTE page before the TLB is flushed. On an APU, the system memory page
backing the freed PTE BO may already be reused and its content changed,
which causes the H/W to generate an unexpected no-retry fault.

The fix is to add a fence to the freed page table BO and signal that fence
after the TLB is flushed, so that the page table BO page is really freed
only then.

Philip Yang (4):
  drm/amdgpu: Implement page table BO fence
  drm/amdkfd: Signal page table fence after KFD flush tlb
  drm/amdgpu: Signal page table fence after gfx vm flush
  drm/amdgpu: Add fence to the freed page table BOs

 drivers/gpu/drm/amd/amdgpu/amdgpu.h       |  2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 45 +++++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c    |  7 ++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c    |  4 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h    |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c | 33 +++++++++++------
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  |  5 +++
 7 files changed, 86 insertions(+), 11 deletions(-)

-- 
2.35.1