On Tue, Jul 12, 2011 at 10:21:05PM +0800, Felix Fietkau wrote: > On 12.07.2011, at 21:03, Michał Mirosław <mirq-linux@xxxxxxxxxxxx> wrote: > > > On Tue, Jul 12, 2011 at 08:54:32PM +0800, Felix Fietkau wrote: > >> On 12.07.2011, at 17:55, Michał Mirosław <mirq-linux@xxxxxxxxxxxx> wrote: > >> > >>> On Tue, Jul 12, 2011 at 12:36:06PM +0800, Felix Fietkau wrote: > >>>> On 2011-07-11 8:52 AM, Michał Mirosław wrote: > >>>>> Also constify buf_addr for ath9k_hw_process_rxdesc_edma() to verify > >>>>> assumptions --- dma_sync_single_for_device() call can be removed. > >>>>> > >>>>> Signed-off-by: Michał Mirosław<mirq-linux@xxxxxxxxxxxx> > >>>>> --- > >>>>> drivers/net/wireless/ath/ath9k/ar9003_mac.c | 4 ++-- > >>>>> drivers/net/wireless/ath/ath9k/ar9003_mac.h | 2 +- > >>>>> drivers/net/wireless/ath/ath9k/recv.c | 10 +++------- > >>>>> 3 files changed, 6 insertions(+), 10 deletions(-) > >>>>> > >>>>> diff --git a/drivers/net/wireless/ath/ath9k/recv.c b/drivers/net/wireless/ath/ath9k/recv.c > >>>>> index 70dc8ec..c5f46d5 100644 > >>>>> --- a/drivers/net/wireless/ath/ath9k/recv.c > >>>>> +++ b/drivers/net/wireless/ath/ath9k/recv.c > >>>>> @@ -684,15 +684,11 @@ static bool ath_edma_get_buffers(struct ath_softc *sc, > >>>>> BUG_ON(!bf); > >>>>> > >>>>> dma_sync_single_for_cpu(sc->dev, bf->bf_buf_addr, > >>>>> - common->rx_bufsize, DMA_FROM_DEVICE); > >>>>> + common->rx_bufsize, DMA_BIDIRECTIONAL); > >>>>> > >>>>> ret = ath9k_hw_process_rxdesc_edma(ah, NULL, skb->data); > >>>>> - if (ret == -EINPROGRESS) { > >>>>> - /*let device gain the buffer again*/ > >>>>> - dma_sync_single_for_device(sc->dev, bf->bf_buf_addr, > >>>>> - common->rx_bufsize, DMA_FROM_DEVICE); > >>>>> + if (ret == -EINPROGRESS) > >>>>> return false; > >>>>> - } > >>>>> > >>>>> __skb_unlink(skb,&rx_edma->rx_fifo); > >>>>> if (ret == -EINVAL) { > >>>> I have strong doubts about this change. On most MIPS devices, > >>>> dma_sync_single_for_cpu is a no-op, whereas > >>>> dma_sync_single_for_device flushes the cache range. With this > >>>> change, the CPU could cache the DMA status part behind skb->data and > >>>> that cache entry would not be flushed inbetween calls to this > >>>> functions on the same buffer, likely leading to rx stalls. > >>> You're suggesting a platform implementation bug then. If the platform is not > >>> cache-coherent, it should invalidate relevant CPU cache lines for sync_to_cpu > >>> and unmap cases. Do other devices show such symptoms on MIPS systems? > >>> > >>> I'm not familiar with the platform internals, so we should ask MIPS people. > >> I only mentioned MIPS to describe the potential side effect of this change. From my current understanding of the DMA API, it would be wrong on other platforms as well. I believe the _for_device function needs to be used to transfer ownership of the buffer back to the device, before calling _for_cpu again later for another read. > > What you're saying reminds the wording in DMA-API-HOWTO.txt that I find > > wrong (or at least misleading) compared to what DMA-API.txt describes. > > DMA sync calls do not transfer the ownership of the buffer - they are > > cache synchronization points, ownership passing is handled entirely by > > the driver. > What I meant was that the DMA sync calls reflect the ownership transfer of the memory regions. In this case ownership is transferred between device and CPU multiple times and the code reflects that. > >> This is definitely required in this case, because when the return code is -EINPROGRESS, the driver waits for the hardware to complete this buffer, and the next call has to fetch the memory again after the device has updated it. > > Correctness of this access should be provided by sync_to_cpu() call. > At least in MIPS I'm sure it isn't. If I remember correctly, it also isn't on ARM, so I'm pretty sure that either your understanding of the API is incorrect, or arch code does not implement it properly. In either case, this change (and probably also the p54 one) should not be merged. I briefly looked through DMA API implementation in MIPS, and except for R10k and R12k both sync_for_cpu and sync_for_device are no-ops (see: arch/mips/mm/dma-default.c). For R10k and R12k the syncs are in both points, and exactly like I described before - CPU cachelines are invalidated for DMA_FROM_DEVICE mappings, written back for DMA_TO_DEVICE, both for DMA_BIDIRECTIONAL (including redundant mapping+sync direction). So doing that sync_to_device you are just invalidating the same cachelines twice for no gain (or do nothing twice in some cases) - they are not read by CPU between sync_to_device -> sync_to_cpu (unless you have other bugs in the driver). Best Regards, Michał Mirosław -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html