On 2016-07-01 17:30, Christian König wrote:
> On 30.06.2016 at 11:34, Chunming Zhou wrote:
>> Change-Id: If10da1e224d81a12fd4f8d760c48178adb9e82d0
>> Signed-off-by: Chunming Zhou <David1.Zhou at amd.com>
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++-
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c     | 4 ++--
>>  2 files changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index a3ca83f..0759c23 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -2002,8 +2002,9 @@ retry:
>>              struct amdgpu_ring *ring = adev->rings[i];
>>              if (!ring)
>>                  continue;
>> +            amd_sched_job_recovery(&ring->sched);
>>              kthread_unpark(ring->sched.thread);
>> -            amdgpu_ring_restore(ring, ring_sizes[i], ring_data[i]);
>> +            kfree(ring_data[i]);
>>              ring_sizes[i] = 0;
>>              ring_data[i] = NULL;
>>          }
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index cced2f6..7393473 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -384,11 +384,11 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring,
>>      amdgpu_ring_emit_pipeline_sync(ring);
>>      if (ring->funcs->emit_vm_flush &&
>> -        pd_addr != AMDGPU_VM_NO_FLUSH) {
>> +        (pd_addr != AMDGPU_VM_NO_FLUSH || amdgpu_vm_is_gpu_reset(adev, id))) {
>>          struct fence *fence;
>>          trace_amdgpu_vm_flush(pd_addr, ring->idx, vm_id);
>> -        amdgpu_ring_emit_vm_flush(ring, vm_id, pd_addr);
>> +        amdgpu_ring_emit_vm_flush(ring, vm_id, id->pd_gpu_addr);
>
> NAK, we need to handle this differently. The problem is that
> id->pd_gpu_addr could already have been reset when you have more than
> one submission to the same engine.
>
> E.g. submission A1 uses VMID 1 and PD address A, and submission B1 uses
> VMID 1 as well but PD address B. When we do it like this we would use
> PD address B for both submissions on restart.

Ah, I just realized my branch doesn't have your "save the PD..." patch,
which already saves the PD address in the job, so we can use it directly.

>
> I suggest to just drop the AMDGPU_VM_NO_FLUSH special value and use a
> boolean to signal that a flush is needed instead.

Yes.

Thanks,
David Zhou

>
> Regards,
> Christian.
>
>>          r = amdgpu_fence_emit(ring, &fence);
>>          if (r)
>
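
To make the A1/B1 case concrete, here is a minimal standalone sketch. It is not the amdgpu code: every struct and function name below is hypothetical and only models the idea. It assumes the shared per-VMID state keeps just the most recent PD address, while each job carries its own copy plus a boolean flush flag in place of a special "no flush" address value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; NOT the real amdgpu structures. */
struct vm_id_state {
	uint64_t pd_gpu_addr;	/* shared per-VMID state, overwritten on each submit */
};

struct job {
	unsigned int vm_id;
	uint64_t pd_addr;	/* PD address captured at submit time */
	bool need_flush;	/* boolean instead of a magic "no flush" address */
};

/* Submit: store the PD address in the job and update the shared VMID state. */
static struct job submit(struct vm_id_state *ids, unsigned int vm_id,
			 uint64_t pd_addr)
{
	struct job j = { .vm_id = vm_id, .pd_addr = pd_addr, .need_flush = true };

	ids[vm_id].pd_gpu_addr = pd_addr;
	return j;
}

/* Address a replay would flush with if it read the shared VMID state. */
static uint64_t replay_addr_from_id(const struct vm_id_state *ids,
				    const struct job *j)
{
	return ids[j->vm_id].pd_gpu_addr;
}

/* Address a replay would flush with if it read the job's own copy. */
static uint64_t replay_addr_from_job(const struct job *j)
{
	assert(j->need_flush);	/* only jobs flagged for a flush reach here */
	return j->pd_addr;
}
```

With two submissions on VMID 1 (A1 with address A, then B1 with address B), replay_addr_from_id() returns B for both jobs, which is the problem described above, while replay_addr_from_job() returns A for A1 and B for B1.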