On Tue, May 26, 2015 at 09:49:15AM +0100, Chris Wilson wrote: > On Tue, May 26, 2015 at 10:01:24AM +0200, Daniel Vetter wrote: > > On Mon, May 25, 2015 at 06:48:44PM +0100, Chris Wilson wrote: > > > Since the advent of mmap(wc), where we reused the same cache domain for > > > WC and GTT paths (oh, how I regret that double-edged advice), we need to > > > be extra cautious when using GTT iomap_wc internally. Since userspace maybe > > > modifying through the mmap(wc) and we then modify modifying through an > > > aliased WC path through the GTT, those writes may overlap and not be > > > visible to the other path. Easiest to trigger appears to be write the > > > batch through mmap(wc) and then attempt to perform reloc the GTT, > > > corruption quickly ensues. > > > > > > v2: Be stricter and do a full mb() in case we are reading through one > > > path and about to write through the second path. Also, apply the > > > barriers on transitioning into the GTT domain as well to cover the white > > > lie asychronous cases. > > > > > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > > > Cc: Akash Goel <akash.goel@xxxxxxxxx> > > > Cc: stable@xxxxxxxxxxxxxxx > > > > There are a lot more mb() and wmb() in the gem code. I count execlist, > > legacy rings, pwrite, set_domain and finish_gtt. > > You have more than I have... Well yeah, but the patch is also cc: stable. Which means we'd need to frob the barriers first, then refactor. And I'm still not sure whether this is road we want to go down, or whether adding explicit begin/end coherency ioctls wouldn't be the better long-term option. It always irked me a bit that the kernel needs to infer the other edge of a cpu access, explicit begin/end seemes better (and more in line with what all the gpu apis expose). Something like this struct i915_mmap_control { #define BEGIN 0 #define END 1 #define READ 0 #define WRITE 2 #define UNSYNCHRONIZED 4 /* or whetever the bikeshed will be */ #define MAPPING_MASK 0xf0 #define GTT_WC (1 << 4) #define CPU_WC (2 << 4) #define CPU_CACHED (3 << 4) u32 flags; u32 bo; u64 start; u64 length; }; Pretty much modelled after the begin/end_cpu_access hooks from dma-buf. Which seems to gain some traction for a generic up/download interface ... It just feels like incrementally fixing things might lead us to a very bad place. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch -- To unsubscribe from this list: send the line "unsubscribe stable" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html