On 10/17/10 6:22 AM, Kevin Cernekee wrote:
On processors with deep write buffers, it is likely that many cycles will pass between a CACHE instruction and the time the data actually gets written out to DRAM. Add a SYNC instruction to ensure that the buffers get emptied before the flush functions return.

Actual problem seen in the wild:

1) dma_alloc_coherent() allocates cached memory
2) memset() is called to clear the new pages
3) dma_cache_wback_inv() is called to flush the zero data out to memory
4) dma_alloc_coherent() returns an uncached (kseg1) pointer to the freshly allocated pages
5) Caller writes data through the kseg1 pointer
6) Buffered writeback data finally gets flushed out to DRAM
7) Part of caller's data is inexplicably zeroed out

This patch adds SYNC between steps 3 and 4, which fixed the problem.

Signed-off-by: Kevin Cernekee <cernekee@xxxxxxxxx>
---
 arch/mips/mm/c-r4k.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/mips/mm/c-r4k.c b/arch/mips/mm/c-r4k.c
index 6721ee2..05c3de3 100644
--- a/arch/mips/mm/c-r4k.c
+++ b/arch/mips/mm/c-r4k.c
@@ -605,6 +605,7 @@ static void r4k_dma_cache_wback_inv(unsigned long addr, unsigned long size)
 			r4k_blast_scache();
 		else
 			blast_scache_range(addr, addr + size);
+		__sync();
 		return;
 	}
Basically, agreed. I have similar workarounds when initiating DMA, where we need to flush data out to DRAM before starting DMA transactions. This looks like a similar situation.

But I have a concern. I suspect that the SYNC insn alone is still not enough, is it? In systems where the write buffer is that 'deep' and the data incoherency is visibly observed, there still may be data write transactions floating in the internal bus system. To make sure that all data (data inside the processor's write buffer and data floating in the internal bus system) has reached DRAM, we need the following three steps:

1. Flush data cache
2. Uncached, dummy load operation from _DRAM_ (not somewhere else)
3. then SYNC instruction

With these steps, data will be pushed out of the processor's write buffer, the processor waits for the uncached load operation to complete, and then finally the pipeline gets cleared.

Thoughts?

Shinya