Re: dma_sync_*_for_cpu and direction=TO_DEVICE (was Re: [PATCH 02/20] dma-mapping: provide a generic dma-noncoherent implementation)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Russel,

On Fri, 2018-05-18 at 18:50 +0100, Russell King - ARM Linux wrote:
It's necessary.  Take a moment to think carefully about this:

        dma_map_single(, dir)

        dma_sync_single_for_cpu(, dir)

        dma_sync_single_for_device(, dir)

        dma_unmap_single(, dir)

In the case of a DMA-incoherent architecture, the operations done at each
stage depend on the direction argument:

        map             for_cpu         for_device      unmap
TO_DEV  writeback       none            writeback       none
TO_CPU  invalidate      invalidate*     invalidate      invalidate*
BIDIR   writeback       invalidate      writeback       invalidate

* - only necessary if the CPU speculatively prefetches.

I think invalidation of DMA buffer is required on for_cpu(TO_CPU) even
if CPU doesn't preferch - what if we reuse the same buffer for multiple
reads from DMA device?

The multiple invalidations for the TO_CPU case handles different
conditions that can result in data corruption, and for some CPUs, all
four are necessary.

I would agree that map()/unmap() a quite a special cases and so depending
on direction we need to execute in them either for_cpu() or for_device()
call-backs depending on direction.

As for invalidation in case of for_device(TO_CPU) I still don't see
a rationale behind it. Would be interesting to see a real example where
we benefit from this.

This is what is implemented for 32-bit ARM, depending on the CPU
capabilities, as we have DMA incoherent devices and we have CPUs that
speculatively prefetch data, and so may load data into the caches while
DMA is in operation.


Things get more interesting if the implementation behind the DMA API has
to copy data between the buffer supplied to the mapping and some DMA
accessible buffer:

        map             for_cpu         for_device      unmap
TO_DEV  copy to dma     none            copy to dma     none
TO_CPU  none            copy to cpu     none            copy to cpu
BIDIR   copy to dma     copy to cpu     copy to dma     copy to cpu

So, in both cases, the value of the direction argument defines what you
need to do in each call.

Interesting enough in your seond table (which describes more complicated
case indeed) you set "none" for for_device(TO_CPU) which looks logical
to me.

So IMHO that's what make sense:
---------------------------->8-----------------------------
        map             for_cpu         for_device      unmap
TO_DEV  writeback       none            writeback       none
TO_CPU  none            invalidate      none            invalidate*
BIDIR   writeback       invalidate      writeback       invalidate*
---------------------------->8-----------------------------

* is the case for prefetching CPU.

-Alexey��.n��������+%������w��{.n�����{��n����ܨ}���Ơz�j:+v�����w����ޙ��&�)ߡ�a����z�ޗ���ݢj��w�f




[Index of Archives]     [Video for Linux]     [Yosemite News]     [Linux S/390]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux