On Thu, Nov 6, 2014 at 12:53 PM, Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> wrote: > On Thu, Nov 06, 2014 at 09:09:52PM +0200, Ville Syrjälä wrote: >> On Thu, Nov 06, 2014 at 10:55:12AM +0000, Chris Wilson wrote: >> > On Wed, Nov 05, 2014 at 12:11:41PM +0100, Daniel Vetter wrote: >> > > On Thu, Oct 30, 2014 at 5:18 PM, Rodrigo Vivi <rodrigo.vivi@xxxxxxxxx> wrote: >> > > > Global GTT doesn't have pat_sel[2:0] so it always point to pat_sel = 000; >> > > > So the only way to avoid screen corruptions is setting PAT 0 to Uncached. >> > > > >> > > > MOCS can still be used though. But if userspace is trusting PTE for >> > > > cache selection the safest thing to do is to let caches disabled. >> > > > >> > > > BSpec: "For GGTT, there is NO pat_sel[2:0] from the entry, >> > > > so RTL will always use the value corresponding to pat_sel = 000" >> > > > >> > > > Reference: https://bugs.freedesktop.org/show_bug.cgi?id=85576 >> > > > Cc: James Ausmus <james.ausmus@xxxxxxxxx> >> > > > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@xxxxxxxxx> >> > > >> > > Ok, I think this is -fixes + cc: stable material, but we need to be a >> > > bit more elaborate on the commit message: >> > > >> > > - System agent ggtt writes (i.e. cpu gtt mmaps) already work before >> > > this patch, i.e. the same uncached + snooping access like on gen6/7 >> > > seems to be in effect. >> > > - So this just fixes blitter/render access. Again it looks like it's >> > > not just uncached access, but uncached + snooping. So we can still >> > > hold onto all our assumptions wrt cpu clflushing on LLC machines. >> > > >> > > I think this should be both in the commit message and code. >> > > >> > > Chris, please correct if I've summarized this wrongly. >> > >> > Me, I just get confused by the statement that writes through the System >> > Agent to main memory are snooped. >> > >> > But the statement that GTT + WC is coherent with the display as it was >> > on SNB+ is certainly true. >> >> I have a supicious mind so I simply had to go and verify this on my >> IVB. Writes to a LLC cacheable scanout buffer through a GTT mmap did >> not result in any cache dirt on the screen. So I must admit defeat. >> >> I also did another experiment where I wrote the data through a cpu >> mmap, which obviously resulted in plenty of cache dirt. But I then >> followed it with a read through the GTT mmap (obj still mapped as >> cacheable in GTT), and the data matched what I wrote, and it also >> cleared the cache dirt from the screen. So any CPU GTT access will >> snoop the caches and also causes a writeback of the cacheline. I >> suspect it also invalidates the cacheline, but I didn't verify that >> in any way. >> >> So I guess I can rest easier now having witnessed it all with mine >> own eyes :) > > I also repeated the same tests on CHV, and it behaves in a manner > consistent with its non-LLC heritage ie. CPU GTT access never snoops > no matter what the PTE says. > > I also frobbed with the PAT entries a bit, and that at least confirms > that status page writes go through PAT[0] even though we set up the > PTEs to use PAT[4]. When PTE[0] doesn't have the snoop bit, all we get > is a GPU "hang", and with the snoop bit set things work fine. > > So I think that for CHV the ppgtt=0 case is perfectly fine with the > current code because: > 1) CPU GTT access is fine as it never snoops and we expect it not to, > in fact we hand a SIGBUS to anyone who tries this > 2) GPU access to snooped bo will work since PAT[0] says to snoop > 3) GPU access to non-snooped bo will work even if PAT[0] says to snoop, > but will perhaps be a bit slow due to the extra snoops. Should be > able to override with MOCS though and get the speed back. > 4) Status page must be snooped, so things pretty much fall apart when > PAT[0] doesn't have the snoop bit. Well, unless we decide to add > explicit cache flushes to status page reads Great job! Thanks for verifying that deeply. I wasn't really sure about how we are setting the entire PAT table, but with your tests I'm confident we are in the right direction. > > I'm not entirely sure if I should be coserned about the color_adjust > stuff, especially in the case where MOCS is used. We might end up > mixing snooped and non-snooped accesses in a way that the color > adjustment wasn't able to anticipate. uhm... interesting point. > I guess I could just try to > see if I can anger the hardware doing that. But that's a job or > another day. indeed! :) > > -- > Ville Syrjälä > Intel OTC > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@xxxxxxxxxxxxxxxxxxxxx > http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Rodrigo Vivi Blog: http://blog.vivi.eng.br _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx