On Mon, Jan 14, 2013 at 9:50 PM, Chris Wilson <chris at chris-wilson.co.uk> wrote: > On Mon, 14 Jan 2013 21:04:06 +0100, Daniel Vetter <daniel at ffwll.ch> wrote: >> On Mon, Jan 14, 2013 at 8:21 PM, Imre Deak <imre.deak at intel.com> wrote: >> >> + drm_clflush_virt_range(vaddr + page_offset, 4); >> >> + *(uint32_t *)(vaddr + page_offset) = reloc->delta; >> >> + drm_clflush_virt_range(vaddr + page_offset, 4); >> > >> > Discussed this already to some degree with Chris, but I still think the >> > first cache flush is redundant. >> >> Nope, since it's a partial cacheline write, we need to first flush out >> any stale data, then write the dword (which loads the latest data from >> memory into a cacheline). Then we need to flush the updated cacheline >> out into main memory again. Iirc I've mentioned this somewhere in the >> part 4 of my gem intro. > > The question was whether there could be any stale data in that aliased > cacheline, and whether or not the code was a continuation of a cargo > cult. My answer was that it was our working knowledge that that flush > is crucial to unconfuse the CPU. In the absence of a definitive > reference for the mysteries of clflush, we should write one. Iirc for pwrite, the clflush _before_ a partial cacheline write is crucial for correctness. At least I remember writing tests for it, which successfully blow up if you drop that flush ;-) -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch