On Mon, 6 Jun 2022 at 17:36, Russell King (Oracle) <linux@xxxxxxxxxxxxxxx> wrote:
>
> On Mon, Jun 06, 2022 at 04:21:50PM +0100, Will Deacon wrote:
> > (1) What if the DMA transfer doesn't write to every byte in the buffer?
>
> The data that is in RAM gets pulled into the cache and is visible to
> the CPU - but if DMA doesn't write to every byte in the buffer, isn't
> that a DMA failure? Should a buffer that suffers DMA failure be passed
> to the user?
>
> > (2) What if the buffer has a virtual alias in userspace (e.g. because
> > the kernel has GUP'd the buffer)?
>
> Then userspace needs to avoid writing to cachelines that overlap the
> buffer to avoid destroying the action of the DMA. It shouldn't be doing
> this anyway (what happens if userspace writes to the same location that
> is being DMA'd to... who wins?)
>
> However, you're right that invalidating in this case could expose data
> that userspace shouldn't see, and I'd suggest in this case that DMA
> buffers should be cleaned before they're exposed to userspace - so
> userspace only ever gets to see the data that was there at the point
> they're mapped, or is subsequently written by DMA.
>
> I don't think there's anything to be worried about if the invalidation
> reveals stale data, provided the stale data is not older than the data
> that was there on first mapping.
>

Given that cache invalidate without clean could potentially nullify the
effect of, e.g., a memzero_explicit() call, I think a clean is
definitely safer, but OTOH, it is also more costly, and not strictly
necessary for correctness of the DMA operation itself.

So I agree with the suggested change, as long as it is annotated
sufficiently clearly to make the above distinction.

> > Finally, on arm(64), the DMA mapping code tries to deal with buffers
> > that are not cacheline aligned by issuing clean-and-invalidate
> > operations for the overlapping portions at each end of the buffer. I
> > don't think this makes a tonne of sense, as inevitably one of the
> > writers (either the CPU or the DMA) is going to win and somebody is
> > going to run into silent data loss. Having the caller receive
> > DMA_MAPPING_ERROR in this case would probably be better.
>
> Sadly unavoidable - people really like passing unaligned buffers to the
> DMA API, and sometimes those buffers contain information that needs to
> be preserved. I really wish it wasn't that way, because it would make
> life a lot better, but it's what we've had to deal with over the years
> with the likes of the SCSI subsystem (and e.g. its sense buffer that
> was embedded non-cacheline aligned into other structures that had to
> be DMA'd to.)
>

As discussed in the thread related to Catalin's DMA granularity series,
this is something I think we should be addressing with bounce buffering
for inbound DMA. This would allow us to reduce the kmalloc alignment as
well.
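
To make the clean-vs-invalidate distinction concrete, here is a minimal
sketch of the map/unmap-time maintenance under the suggested policy. It
is not the actual arm/arm64 implementation: cache_clean_range() and
cache_inval_range() are hypothetical stand-ins for the architecture's
writeback and invalidate primitives, and dma_dir merely mirrors the
kernel's dma_data_direction values.

#include <stddef.h>	/* size_t */

enum dma_dir { DMA_TO_DEVICE, DMA_FROM_DEVICE, DMA_BIDIRECTIONAL };

/* Hypothetical architecture primitives. */
void cache_clean_range(void *buf, size_t size);	/* write back dirty lines */
void cache_inval_range(void *buf, size_t size);	/* discard lines          */

/* Before handing the buffer to the device (dma_map_* path). */
static void sync_for_device(void *buf, size_t size, enum dma_dir dir)
{
	/*
	 * Clean (write back) for every direction. For DMA_FROM_DEVICE the
	 * clean is not needed for the DMA itself, but it guarantees that,
	 * if the device does not overwrite every byte, the CPU later sees
	 * the contents the buffer had at map time (e.g. the zeroes left
	 * by memzero_explicit()), never anything older.
	 */
	cache_clean_range(buf, size);
}

/* Before handing the buffer back to the CPU (dma_unmap_* path). */
static void sync_for_cpu(void *buf, size_t size, enum dma_dir dir)
{
	if (dir != DMA_TO_DEVICE)
		/* Drop lines (re)fetched while the device owned the buffer. */
		cache_inval_range(buf, size);
}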
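
On the unaligned-buffer point, the direction I have in mind is roughly
the following - a sketch only, not Catalin's actual series. A buffer
whose ends share cachelines with unrelated data gets redirected through
a SWIOTLB-style bounce slot for inbound DMA instead of being
cleaned/invalidated in place; bounce_map() is a hypothetical helper
standing in for that bounce allocation.

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: redirect the transfer through a cacheline-aligned bounce
 * slot (SWIOTLB-style); data is copied back to 'buf' at unmap time. */
void *bounce_map(void *buf, size_t size);

static bool shares_cachelines(uintptr_t addr, size_t size, size_t line)
{
	/* Either end of the buffer sits inside a cacheline that may also
	 * hold unrelated, CPU-owned data. */
	return (addr % line) || ((addr + size) % line);
}

/* Map-time decision for an inbound (DMA_FROM_DEVICE) buffer. */
static void *map_inbound(void *buf, size_t size, size_t cacheline)
{
	if (shares_cachelines((uintptr_t)buf, size, cacheline)) {
		/*
		 * Inbound DMA into a partial cacheline means either the
		 * CPU's or the device's write to that line gets lost;
		 * bouncing avoids the conflict entirely.
		 */
		return bounce_map(buf, size);
	}

	return buf;	/* naturally aligned: safe to map in place */
}

With misaligned inbound DMA always bounced like this, kmalloc() would no
longer need to round small allocations up to the cacheline size just in
case they end up as DMA targets, which is where the kmalloc alignment
reduction comes from.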