On Mon, 26 Mar 2012 11:26:33 +0200, Daniel Vetter <daniel at ffwll.ch> wrote: > On Mon, Mar 26, 2012 at 10:18:39AM +0100, Chris Wilson wrote: > > On Sun, 25 Mar 2012 19:47:42 +0200, Daniel Vetter <daniel.vetter at ffwll.ch> wrote: > > > The issue is that with inline clflushing the clflushing isn't properly > > > swizzled. Fix this by > > > - always clflushing entire 128 byte chunks and > > > - unconditionally flush before writes when swizzling a given page. > > > We could be clever and check whether we pwrite a partial 128 byte > > > chunk instead of a partial cacheline, but I've figured that's not > > > worth it. > > > > There's some black magic here that I haven't fully grasped. We only ever > > swizzle the gpu address (by whole cachelines), so why do we need to > > invalidate a pair of cachelines for a single cacheline write? > > Well, we do swizzle when doing the actual copy_to|from_user, so strictly > speaking we should also swizzle the clflushing in this case. No bit17 > swizzling pwrite/pread is pretty much only around for backwards-compat > with dead-old userspace, so I've figure I'll just unconditionally align > the clflush range with even cachelines when bit17 swizzling is effective > on the current page. Instead of adding a complex and rather untested > swizzled clflush helper. I was struggling to see where exactly we were swizzling the CPU address because I failed to make the connection between shmem_page_offset and gpu_offset. With that resolved, Daniel you deserve a special award for that!, Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk> -Chris -- Chris Wilson, Intel Open Source Technology Centre