On Thu, Nov 06, 2014 at 10:53:00PM +0200, Ville Syrjälä wrote: > On Thu, Nov 06, 2014 at 09:09:52PM +0200, Ville Syrjälä wrote: > > On Thu, Nov 06, 2014 at 10:55:12AM +0000, Chris Wilson wrote: > > > On Wed, Nov 05, 2014 at 12:11:41PM +0100, Daniel Vetter wrote: > > > > On Thu, Oct 30, 2014 at 5:18 PM, Rodrigo Vivi <rodrigo.vivi@xxxxxxxxx> wrote: > > > > > Global GTT doesn't have pat_sel[2:0] so it always point to pat_sel = 000; > > > > > So the only way to avoid screen corruptions is setting PAT 0 to Uncached. > > > > > > > > > > MOCS can still be used though. But if userspace is trusting PTE for > > > > > cache selection the safest thing to do is to let caches disabled. > > > > > > > > > > BSpec: "For GGTT, there is NO pat_sel[2:0] from the entry, > > > > > so RTL will always use the value corresponding to pat_sel = 000" > > > > > > > > > > Reference: https://bugs.freedesktop.org/show_bug.cgi?id=85576 > > > > > Cc: James Ausmus <james.ausmus@xxxxxxxxx> > > > > > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@xxxxxxxxx> > > > > > > > > Ok, I think this is -fixes + cc: stable material, but we need to be a > > > > bit more elaborate on the commit message: > > > > > > > > - System agent ggtt writes (i.e. cpu gtt mmaps) already work before > > > > this patch, i.e. the same uncached + snooping access like on gen6/7 > > > > seems to be in effect. > > > > - So this just fixes blitter/render access. Again it looks like it's > > > > not just uncached access, but uncached + snooping. So we can still > > > > hold onto all our assumptions wrt cpu clflushing on LLC machines. > > > > > > > > I think this should be both in the commit message and code. > > > > > > > > Chris, please correct if I've summarized this wrongly. > > > > > > Me, I just get confused by the statement that writes through the System > > > Agent to main memory are snooped. > > > > > > But the statement that GTT + WC is coherent with the display as it was > > > on SNB+ is certainly true. > > > > I have a supicious mind so I simply had to go and verify this on my > > IVB. Writes to a LLC cacheable scanout buffer through a GTT mmap did > > not result in any cache dirt on the screen. So I must admit defeat. > > > > I also did another experiment where I wrote the data through a cpu > > mmap, which obviously resulted in plenty of cache dirt. But I then > > followed it with a read through the GTT mmap (obj still mapped as > > cacheable in GTT), and the data matched what I wrote, and it also > > cleared the cache dirt from the screen. So any CPU GTT access will > > snoop the caches and also causes a writeback of the cacheline. I > > suspect it also invalidates the cacheline, but I didn't verify that > > in any way. > > > > So I guess I can rest easier now having witnessed it all with mine > > own eyes :) > > I also repeated the same tests on CHV, and it behaves in a manner > consistent with its non-LLC heritage ie. CPU GTT access never snoops > no matter what the PTE says. > > I also frobbed with the PAT entries a bit, and that at least confirms > that status page writes go through PAT[0] even though we set up the > PTEs to use PAT[4]. When PTE[0] doesn't have the snoop bit, all we get > is a GPU "hang", and with the snoop bit set things work fine. > > So I think that for CHV the ppgtt=0 case is perfectly fine with the > current code because: > 1) CPU GTT access is fine as it never snoops and we expect it not to, > in fact we hand a SIGBUS to anyone who tries this > 2) GPU access to snooped bo will work since PAT[0] says to snoop > 3) GPU access to non-snooped bo will work even if PAT[0] says to snoop, > but will perhaps be a bit slow due to the extra snoops. Should be > able to override with MOCS though and get the speed back. > 4) Status page must be snooped, so things pretty much fall apart when > PAT[0] doesn't have the snoop bit. Well, unless we decide to add > explicit cache flushes to status page reads I think a patch which adds a comment to the chv pat code stating that pat[0] must be cached would be great. Otherwise someone will go and "clean" it up I fear ... > I'm not entirely sure if I should be coserned about the color_adjust > stuff, especially in the case where MOCS is used. We might end up > mixing snooped and non-snooped accesses in a way that the color > adjustment wasn't able to anticipate. I guess I could just try to > see if I can anger the hardware doing that. But that's a job or > another day. tbh I'm not sure any more whether we still need this on vlv/chv - they did change the snooping logic a lot. Iirc pre-gen5 only could snoop with the blitter, vlv/chv can also snoop with the render engine. So maybe this has improved enough that we don't need the color stuff any more. I guess a good test would be to disable the color stuff on vlv/chv and see what happens ... -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx