On 2018-07-10 20:03, Lis, Tomasz wrote:
On 2018-07-09 18:28, Tvrtko Ursulin wrote:
On 09/07/2018 14:20, Tomasz Lis wrote:
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 1593194..f6965ae 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
[...]
+/*
+ * When data port level coherency is enabled, the GPU will update memory
+ * buffers shared with CPU, by forcing internal cache units to send memory
+ * writes to real RAM faster. Keeping such coherency has performance cost.
Is this comment correct? Is it actually sending memory writes to
_RAM_, or is it just that a coherency mode is enabled, which adds a
cost even if the writes only target the CPU or a shared cache?
I'm not sure whether there are further coherency modes to choose how
"deep" the coherency goes. The use case of the OCL Team is to see gradual
changes in the buffers on the CPU side while the execution progresses.
A write to RAM is needed to achieve that, and that limits performance by
consuming RAM bandwidth.
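
To make the use case concrete, here is a minimal CPU-only sketch of one side
observing a buffer fill up while the other side is still writing it. It is
an analogy, not i915 or OpenCL code: a pthread stands in for the GPU, and
the names shared_buf, producer and observe_progress are made up for
illustration. It says nothing about which cache level the real hardware
writes to.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define N_ITEMS 1024

/* Hypothetical shared buffer: 'data' stands in for the surface the GPU
 * writes, 'done' is a progress counter polled by the CPU side. */
struct shared_buf {
	int data[N_ITEMS];
	atomic_size_t done;
};

/* Stand-in for the GPU workload: fill the buffer, publishing progress. */
static void *producer(void *arg)
{
	struct shared_buf *buf = arg;

	for (size_t i = 0; i < N_ITEMS; i++) {
		buf->data[i] = (int)(i * 2);
		/* Release: data[i] becomes visible before the new count. */
		atomic_store_explicit(&buf->done, i + 1, memory_order_release);
	}
	return NULL;
}

/* CPU side: consume elements as they are published, while the producer
 * is still running. Returns the sum of all elements, or -1 if the thread
 * could not be started or an element did not hold the expected value. */
static long observe_progress(struct shared_buf *buf)
{
	pthread_t tid;
	size_t seen = 0;
	long sum = 0;
	int stale = 0;

	atomic_init(&buf->done, 0);
	if (pthread_create(&tid, NULL, producer, buf) != 0)
		return -1;

	while (seen < N_ITEMS) {
		/* Acquire: pairs with the release store in producer(). */
		size_t done = atomic_load_explicit(&buf->done,
						   memory_order_acquire);

		for (; seen < done; seen++) {
			if (buf->data[seen] != (int)(seen * 2))
				stale = 1;
			sum += buf->data[seen];
		}
	}

	pthread_join(tid, NULL);
	return stale ? -1 : sum;
}
```

Calling observe_progress() on a struct shared_buf returns the sum of the
published values. The release/acquire pair is what guarantees the observer
never reads an element before it was written; in this analogy, data port
coherency plays the corresponding role for GPU writes.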
It was pointed out to me that the last level cache is shared between the
CPU and GPU on non-Atom platforms. That means my argument was invalid, and
most likely the coherency option does not force a write to RAM. I will
update the comment.
-Tomasz
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx