Re: [Intel-gfx] [PATCH v2] drm/i915: Force wmb() on using GTT io_mapping_map_wc

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, May 26, 2015 at 09:49:15AM +0100, Chris Wilson wrote:
> On Tue, May 26, 2015 at 10:01:24AM +0200, Daniel Vetter wrote:
> > On Mon, May 25, 2015 at 06:48:44PM +0100, Chris Wilson wrote:
> > > Since the advent of mmap(wc), where we reused the same cache domain for
> > > WC and GTT paths (oh, how I regret that double-edged advice), we need to
> > > be extra cautious when using GTT iomap_wc internally. Since userspace maybe
> > > modifying through the mmap(wc) and we then modify modifying through an
> > > aliased WC path through the GTT, those writes may overlap and not be
> > > visible to the other path. Easiest to trigger appears to be write the
> > > batch through mmap(wc) and then attempt to perform reloc the GTT,
> > > corruption quickly ensues.
> > > 
> > > v2: Be stricter and do a full mb() in case we are reading through one
> > > path and about to write through the second path. Also, apply the
> > > barriers on transitioning into the GTT domain as well to cover the white
> > > lie asychronous cases.
> > > 
> > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > > Cc: Akash Goel <akash.goel@xxxxxxxxx>
> > > Cc: stable@xxxxxxxxxxxxxxx
> > 
> > There are a lot more mb() and wmb() in the gem code. I count execlist,
> > legacy rings, pwrite, set_domain and finish_gtt.
> 
> You have more than I have...

Well yeah, but the patch is also cc: stable. Which means we'd need to frob
the barriers first, then refactor. And I'm still not sure whether this is
road we want to go down, or whether adding explicit begin/end coherency
ioctls wouldn't be the better long-term option. It always irked me a bit
that the kernel needs to infer the other edge of a cpu access, explicit
begin/end seemes better (and more in line with what all the gpu apis
expose). Something like this

struct i915_mmap_control {
#define BEGIN 0
#define END 1
#define READ 0
#define WRITE 2
#define UNSYNCHRONIZED 	4 /* or whetever the bikeshed will be */
#define MAPPING_MASK 0xf0
#define GTT_WC		(1 << 4)
#define CPU_WC		(2 << 4)
#define CPU_CACHED	(3 << 4)

	u32 flags;
	u32 bo;
	u64 start;
	u64 length;
};

Pretty much modelled after the begin/end_cpu_access hooks from dma-buf.
Which seems to gain some traction for a generic up/download interface ...

It just feels like incrementally fixing things might lead us to a very bad
place.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]