[Linaro-mm-sig] nouveau vs arm dma ops

daniel.vetter@xxxxxxxx (Daniel Vetter) · Thu, 26 Apr 2018 11:39:09 +0200

On Thu, Apr 26, 2018 at 11:24 AM, Christoph Hellwig <hch at infradead.org> wrote:
> On Thu, Apr 26, 2018 at 11:20:44AM +0200, Daniel Vetter wrote:
>> The above is already what we're implementing in i915, at least
>> conceptually (it all boils down to clflush instructions because those
>> both invalidate and flush).
>
> The clwb instruction that just writes back dirty cache lines might
> be very interesting for the x86 non-coherent dma case.  A lot of
> architectures use their equivalent to prepare for to-device transfers.

IIRC clwb didn't help much for i915 use-cases. Either data gets
streamed between CPU and GPU, in which case keeping the clean
cacheline around doesn't buy you anything. Or we need to flush because
the GPU really wants to use non-snooped transactions (faster, lower
latency, and less power required for display because you can shut down
the caches), in which case there's also no benefit to keeping the
cacheline around (no one will ever need it again).

I think clwb is more for persistent memory and stuff like that, not so
much for gpus.
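
To make the clflush side concrete, here is a rough userspace sketch of
flushing a buffer one cacheline at a time. It is not the i915 code:
flush_range() and the 64-byte CACHELINE constant are made up for
illustration, and on builds without SSE2 the loop only counts lines
instead of flushing them. Swapping _mm_clflush() for _mm_clwb() would
give the write-back-only variant discussed above, but that needs CPU
support for clwb, so it is left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>   /* _mm_clflush(), _mm_mfence() */
#define HAVE_CLFLUSH 1
#else
#define HAVE_CLFLUSH 0   /* no clflush available: only count lines */
#endif

#define CACHELINE 64     /* assumed cacheline size, not queried from CPUID */

/*
 * Hypothetical helper: flush every cacheline overlapping
 * [addr, addr + len). clflush both writes back dirty data and
 * invalidates the line, so nothing stays cached afterwards.
 * Returns the number of cachelines visited.
 */
static size_t flush_range(const void *addr, size_t len)
{
    assert(addr != NULL);

    if (len == 0)
        return 0;

    /* Round the start down to a cacheline boundary. */
    uintptr_t p = (uintptr_t)addr & ~(uintptr_t)(CACHELINE - 1);
    uintptr_t end = (uintptr_t)addr + len;
    size_t lines = 0;

#if HAVE_CLFLUSH
    _mm_mfence();        /* order earlier stores before the flushes */
#endif
    for (; p < end; p += CACHELINE) {
#if HAVE_CLFLUSH
        _mm_clflush((const void *)p);
#endif
        lines++;
    }
#if HAVE_CLFLUSH
    _mm_mfence();        /* make sure the flushes have completed */
#endif
    return lines;
}
```

An unaligned range touches every line it overlaps, so a 64-byte range
starting one byte into a line flushes two lines.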
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch