On 19.05.2017 at 04:25, Flora Cui wrote:
> On Thu, May 18, 2017 at 01:38:15PM +0200, Christian König wrote:
>> On 18.05.2017 at 09:45, Flora Cui wrote:
>>> partial revert commit <6971d3d> - drm/amdgpu: cleanup logic in
>>> amdgpu_vm_flush
>>>
>>> Change-Id: Iadce9d613dfe9a739643a74050cea55854832adb
>>> Signed-off-by: Flora Cui <Flora.Cui at amd.com>
>> I don't see how the revert should be faster than the original.
>>
>> Especially that amdgpu_vm_had_gpu_reset() is now called twice sounds like
>> more overhead than necessary.
>>
>> Please explain further.
>>
>> Christian.
>>
>>> ---
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 14 +++++---------
>>>  1 file changed, 5 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> index 88420dc..a96bad6 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> @@ -743,23 +743,19 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, struct amdgpu_job *job)
>>>  		id->gws_size != job->gws_size ||
>>>  		id->oa_base != job->oa_base ||
>>>  		id->oa_size != job->oa_size);
>>> -	bool vm_flush_needed = job->vm_needs_flush ||
>>> -		amdgpu_vm_ring_has_compute_vm_bug(ring);
>>>  	unsigned patch_offset = 0;
>>>  	int r;
>>>
>>> -	if (amdgpu_vm_had_gpu_reset(adev, id)) {
>>> -		gds_switch_needed = true;
>>> -		vm_flush_needed = true;
>>> -	}
>>> -
>>> -	if (!vm_flush_needed && !gds_switch_needed)
>>> +	if (!job->vm_needs_flush && !gds_switch_needed &&
>>> +	    !amdgpu_vm_had_gpu_reset(adev, id) &&
>>> +	    !amdgpu_vm_ring_has_compute_vm_bug(ring))
>>>  		return 0;
>>>
>>>  	if (ring->funcs->init_cond_exec)
>>>  		patch_offset = amdgpu_ring_init_cond_exec(ring);
>>>
>>> -	if (ring->funcs->emit_vm_flush && vm_flush_needed) {
> [flora]: for compute ring & amdgpu_vm_ring_has_compute_vm_bug(), a vm_flush is
> inserted. This might cause a performance drop.

Ah, I see. We only need the pipeline sync, but not the vm flush.
In this case I suggest just changing the following lines in amdgpu_vm_flush():

> -	bool vm_flush_needed = job->vm_needs_flush ||
> -		amdgpu_vm_ring_has_compute_vm_bug(ring);

We can keep the check in amdgpu_vm_need_pipeline_sync().

BTW: We should cache the result of amdgpu_vm_ring_has_compute_vm_bug() in the vm manager structure. Computing this on the fly for every command submission is just a huge bunch of overhead.

Regards,
Christian.