[Why]
When a job is scheduled during TDR (after the device reset count has been
increased but before drm_sched_stop), the job will not do a vm_flush when it
is resubmitted after the GPU reset completes. This can lead to a page fault.

[How]
Always do a vm_flush for resubmitted jobs.

Signed-off-by: Jingwen Chen <Jingwen.Chen2@xxxxxxx>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index fdbe7d4e8b8b..4af2c5d15950 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1088,7 +1088,8 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, struct amdgpu_job *job,
 	if (update_spm_vmid_needed && adev->gfx.rlc.funcs->update_spm_vmid)
 		adev->gfx.rlc.funcs->update_spm_vmid(adev, job->vmid);
 
-	if (amdgpu_vmid_had_gpu_reset(adev, id)) {
+	if (amdgpu_vmid_had_gpu_reset(adev, id) ||
+	    (job->base.flags & DRM_FLAG_RESUBMIT_JOB)) {
 		gds_switch_needed = true;
 		vm_flush_needed = true;
 		pasid_mapping_needed = true;
-- 
2.25.1

_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
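
For reviewers who want to reason about the race outside the driver, here is a minimal user-space sketch of the changed condition. It is not driver code. The mock_* types and helpers are hypothetical stand-ins, the bit value given to DRM_FLAG_RESUBMIT_JOB is an assumption (the patch only names the flag), and the reset check is modeled on the assumption that amdgpu_vmid_had_gpu_reset() compares the reset count cached in the VMID against the device's current count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit value for illustration; the patch only names the flag. */
#define DRM_FLAG_RESUBMIT_JOB (1u << 0)

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct mock_vmid {
	uint32_t current_gpu_reset_count; /* reset count seen when the VMID was grabbed */
};

struct mock_job {
	uint32_t flags;
};

/*
 * Assumed model of amdgpu_vmid_had_gpu_reset(): the VMID counts as stale
 * when its cached reset count differs from the device's current count.
 */
static bool mock_vmid_had_gpu_reset(uint32_t dev_reset_count,
				    const struct mock_vmid *id)
{
	return id->current_gpu_reset_count != dev_reset_count;
}

/* The patched condition: also force a full flush for resubmitted jobs. */
static bool mock_needs_full_flush(uint32_t dev_reset_count,
				  const struct mock_vmid *id,
				  const struct mock_job *job)
{
	assert(id && job);
	return mock_vmid_had_gpu_reset(dev_reset_count, id) ||
	       (job->flags & DRM_FLAG_RESUBMIT_JOB);
}
```

In the scenario from [Why], the job grabs its VMID after the reset count was already bumped, so the cached and current counts match and the reset check alone returns false. With the extra flag test, a resubmitted job still gets gds_switch_needed, vm_flush_needed and pasid_mapping_needed set.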