Re: [REGRESSION] Recent swiotlb DMA_FROM_DEVICE fixes break ath9k-based AP

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



So I've been watching this from the sidelines mostly, and discussing a
bit with Toke, but:

On Fri, 2022-03-25 at 11:30 -0700, Linus Torvalds wrote:
> 
>  (2) The CPU now wants to see any state written by the device since
> the last sync
> 
>     This is "dma_sync_single_for_cpu(DMA_FROM_DEVICE)".
> 
>     A bounce-buffer implementation needs to copy *from* the bounce buffer.
> 
>     A cache-coherent implementation needs to do nothing.
> 
>     A non-coherent implementation maybe needs to do nothing (ie it
> assumes that previous ops have flushed the cache, and just accessing
> the data will bring the rigth thing back into it). Or it could just
> flush the cache.

Doesn't that just need to *invalidate* the cache, rather than *flush*
it? The cache is somewhat similar to the bounce buffer, and here you're
copying _from_ the bounce buffer (which is where the device is
accessing), so shouldn't it be the same for the cache, i.e. you
invalidate it so you read again from the real memory?

>  (3) The CPU has seen the state, but wants to leave it to the device
> 
>    This is "dma_sync_single_for_device(DMA_FROM_DEVICE)".
> 
>    A bounce buffer implementation needs to NOT DO ANYTHING (this is
> the current ath9k bug - copying to the bounce buffer is wrong)
> 
>   A cache coherent implementation needs to do nothing
> 
>   A non-coherent implementation needs to flush the cache again, bot
> not necessarily do a writeback-flush if there is some cheaper form
> (assuming it does nothing in the "CPU now wants to see any state" case
> because it depends on the data not having been in the caches)

And similarly here, it would seem that the implementation can't _flush_
the cache as the device might be writing concurrently (which it does in
fact do in the ath9k case), but it must invalidate the cache?

I'm not sure about the (2) case, but here it seems fairly clear cut that
if you have a cache, don't expect the CPU to write to the buffer (as
evidenced by DMA_FROM_DEVICE), you wouldn't want to write out the cache
to DRAM?


I'll also note independently that ath9k actually maps the buffers as
DMA_BIDIRECTIONAL, but the flush operations happen with DMA_FROM_DEVICE,
at least after the setup is done. I must admit that I was scratching my
head about this, I had sort of expected one should be passing the same
DMA direction to all different APIs for the same buffer, but clearly, as
we can see in your list of cases here, that's _not_ true.


Then, however, we need to define what happens if you pass
DMA_BIDIRECTIONAL to the sync_for_cpu() and sync_for_device() functions,
which adds two more cases? Or maybe we eventually just think that's not
valid at all, since you have to specify how you're (currently?) using
the buffer, which can't be DMA_BIDIRECTIONAL?


>  (4) There is a fourth case: dma_sync_single_for_cpu(DMA_TO_DEVICE)
> which maybe should generate a warning because it seems to make no
> sense? I can't think of a case where this would be an issue - the data
> is specifically for the device, but it's synced "for the CPU"?

I'd tend to agree with that, that's fairly much useless, since if only
the CPU wrote to it, then you wouldn't care about any caching or bounce
buffers, so no need to sync back.

> In other words, I think commit aa6f8dcbab47 ("swiotlb: rework 'fix
> info leak with DMA_FROM_DEVICE'") is fundamentally wrong. It doesn't
> just break ath9k, it fundamentally break that "case 3" above. It's
> doing a DMA_TO_DEVICE copy, even though it was a DMA_FROM_DEVICE sync.
> 
> So I really think that "revert aa6f8dcbab47" is not only inevitable
> because of practical worries about what it breaks, but because that
> commit was just entirely and utterly WRONG.

Honestly, I was scratching my head about this too - sadly it just says
"what was agreed", without a pointer to how that was derived, but it
seemed that the original issue was:

 "we're leaking old bounce buffer data to the device"

or was it not? In which case doing any copies during map should've been
sufficient, since then later no more data leaks could occur?

johannes



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux