When amdgpu_cs_wait_ioctl is called with a timeout of zero, the caller
is just interested in the current status of the fence.

The default implementation of dma_fence_wait_timeout on an unsignaled
fence will always call schedule_timeout(), even if the timeout is zero.
This may result in significant overhead for clients that heavily use
this interface.

This patch avoids the dma_fence_wait_timeout overhead by directly
checking the fence status.

Signed-off-by: Andres Rodriguez <andresx7 at gmail.com>
---

I'm not sure if we should be working around this issue at the amdgpu
level, or fixing it at the dma_fence_default_wait level instead.

Source2 games like dota2 are affected by this overhead. This patch
improves dota2 perf on an i7-6700k+RX480 system from 72fps->81fps.

Patch is for drm-next-4.12-wip since this branch is where we operate
on dma_fences directly.

 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c    | 5 ++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 5 +++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h  | 1 +
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index ec71b93..67a5c9f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1168,7 +1168,10 @@ int amdgpu_cs_wait_ioctl(struct drm_device *dev, void *data,
 	if (IS_ERR(fence))
 		r = PTR_ERR(fence);
 	else if (fence) {
-		r = dma_fence_wait_timeout(fence, true, timeout);
+		if (timeout)
+			r = dma_fence_wait_timeout(fence, true, timeout);
+		else
+			r = amdgpu_fence_test_signaled(fence);
 		dma_fence_put(fence);
 	} else
 		r = 1;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 7b60fb7..779a382 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -122,6 +122,11 @@ static u32 amdgpu_fence_read(struct amdgpu_ring *ring)
 	return seq;
 }
 
+bool amdgpu_fence_test_signaled(struct dma_fence *fence)
+{
+	return test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags);
+}
+
 /**
  * amdgpu_fence_emit - emit a fence on the requested ring
  *
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index 944443c..6bbd31d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -84,6 +84,7 @@ int amdgpu_fence_driver_start_ring(struct amdgpu_ring *ring,
 				   unsigned irq_type);
 void amdgpu_fence_driver_suspend(struct amdgpu_device *adev);
 void amdgpu_fence_driver_resume(struct amdgpu_device *adev);
+bool amdgpu_fence_test_signaled(struct dma_fence *fence);
 int amdgpu_fence_emit(struct amdgpu_ring *ring, struct dma_fence **fence);
 void amdgpu_fence_process(struct amdgpu_ring *ring);
 int amdgpu_fence_wait_empty(struct amdgpu_ring *ring);
-- 
2.9.3
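
For readers who want to see the zero-timeout fast path in isolation, here
is a minimal userspace sketch of the branch this patch adds. It is not
kernel code: struct mock_fence, MOCK_FENCE_FLAG_SIGNALED_BIT and all
mock_* functions are hypothetical stand-ins invented for illustration, and
the slow path only counts how often it is entered instead of sleeping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dma_fence; NOT the kernel type. */
struct mock_fence {
	unsigned long flags;
	int slow_waits;	/* how many times the slow wait path was entered */
};

#define MOCK_FENCE_FLAG_SIGNALED_BIT 0	/* assumed bit position, illustration only */

/* Same idea as amdgpu_fence_test_signaled(): read one flag bit, never sleep. */
static bool mock_fence_test_signaled(const struct mock_fence *f)
{
	assert(f != NULL);
	return (f->flags >> MOCK_FENCE_FLAG_SIGNALED_BIT) & 1UL;
}

/*
 * Stand-in for the slow wait path. It only records that it was called and
 * returns the remaining timeout if signaled, 0 otherwise.
 */
static long mock_fence_wait_timeout(struct mock_fence *f, long timeout)
{
	f->slow_waits++;
	return mock_fence_test_signaled(f) ? timeout : 0;
}

/* The branch from the patch: timeout == 0 means "just poll the status". */
static long mock_cs_wait(struct mock_fence *f, long timeout)
{
	if (timeout)
		return mock_fence_wait_timeout(f, timeout);
	return mock_fence_test_signaled(f);
}
```

In this sketch, calling mock_cs_wait(&f, 0) never increments slow_waits,
whether or not the signaled bit is set, which is the overhead the patch
tries to avoid on the real wait path.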