Re: dma_sync_*_for_cpu and direction=TO_DEVICE (was Re: [PATCH 02/20] dma-mapping: provide a generic dma-noncoherent implementation)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Russel,

On Fri, 2018-05-18 at 18:50 +0100, Russell King - ARM Linux wrote:
> It's necessary.  Take a moment to think carefully about this:
> 
>         dma_map_single(, dir)
> 
>         dma_sync_single_for_cpu(, dir)
> 
>         dma_sync_single_for_device(, dir)
> 
>         dma_unmap_single(, dir)
> 
> In the case of a DMA-incoherent architecture, the operations done at each
> stage depend on the direction argument:
> 
>         map             for_cpu         for_device      unmap
> TO_DEV  writeback       none            writeback       none
> TO_CPU  invalidate      invalidate*     invalidate      invalidate*
> BIDIR   writeback       invalidate      writeback       invalidate
> 
> * - only necessary if the CPU speculatively prefetches.

I think invalidation of DMA buffer is required on for_cpu(TO_CPU) even
if CPU doesn't preferch - what if we reuse the same buffer for multiple
reads from DMA device?

> The multiple invalidations for the TO_CPU case handles different
> conditions that can result in data corruption, and for some CPUs, all
> four are necessary.

I would agree that map()/unmap() a quite a special cases and so depending
on direction we need to execute in them either for_cpu() or for_device()
call-backs depending on direction.

As for invalidation in case of for_device(TO_CPU) I still don't see
a rationale behind it. Would be interesting to see a real example where
we benefit from this.

> This is what is implemented for 32-bit ARM, depending on the CPU
> capabilities, as we have DMA incoherent devices and we have CPUs that
> speculatively prefetch data, and so may load data into the caches while
> DMA is in operation.
> 
> 
> Things get more interesting if the implementation behind the DMA API has
> to copy data between the buffer supplied to the mapping and some DMA
> accessible buffer:
> 
>         map             for_cpu         for_device      unmap
> TO_DEV  copy to dma     none            copy to dma     none
> TO_CPU  none            copy to cpu     none            copy to cpu
> BIDIR   copy to dma     copy to cpu     copy to dma     copy to cpu
> 
> So, in both cases, the value of the direction argument defines what you
> need to do in each call.

Interesting enough in your seond table (which describes more complicated
case indeed) you set "none" for for_device(TO_CPU) which looks logical
to me.

So IMHO that's what make sense:
---------------------------->8-----------------------------
        map             for_cpu         for_device      unmap
TO_DEV  writeback       none            writeback       none
TO_CPU  none            invalidate      none            invalidate*
BIDIR   writeback       invalidate      writeback       invalidate*
---------------------------->8-----------------------------

* is the case for prefetching CPU.

-Alexey��.n��������+%������w��{.n������ܖ)�����ܨ}���Ơz�j:+v�����w����ޙ��&�)ߡ�a����z�ޗ���ݢj��w�f




[Index of Archives]     [Kernel Development]     [DCCP]     [Linux ARM Development]     [Linux]     [Photo]     [Yosemite Help]     [Linux ARM Kernel]     [Linux SCSI]     [Linux x86_64]     [Linux Hams]

  Powered by Linux