Re: [PATCH v2 2/2] drm/i915/bxt: Fix inadvertent CPU snooping due to incorrect MOCS config

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Apr 28, 2016 at 01:48:27PM +0300, Ville Syrjälä wrote:
> On Thu, Apr 28, 2016 at 10:13:37AM +0200, Daniel Vetter wrote:
> > On Tue, Apr 26, 2016 at 08:57:51PM +0300, Ville Syrjälä wrote:
> > > On Tue, Apr 26, 2016 at 04:30:05PM +0200, Daniel Vetter wrote:
> > > > On Tue, Apr 26, 2016 at 05:26:43PM +0300, Eero Tamminen wrote:
> > > > > Hi,
> > > > > 
> > > > > On 26.04.2016 16:23, Chris Wilson wrote:
> > > > > >On Tue, Apr 26, 2016 at 04:17:55PM +0300, Imre Deak wrote:
> > > > > >>On ti, 2016-04-26 at 13:57 +0100, Chris Wilson wrote:
> > > > > >>>On Tue, Apr 26, 2016 at 03:44:22PM +0300, Imre Deak wrote:
> > > > > >>>>Setting a write-back cache policy in the MOCS entry definition also
> > > > > >>>>implies snooping, which has a considerable overhead. This is
> > > > > >>>>unexpected for a few reasons:
> > > > > >>>
> > > > > >>>If it is snooping, then I don't see why it is undesirable to have it
> > > > > >>>available in a mocs setting. If it is bogus and the bit is undefined,
> > > > > >>>then by all means remove it.
> > > > > >>
> > > > > >>None of these entries are used alone for coherent surfaces. For that
> > > > > >>the application would have to use entry index#1 or #2 _and_ call the
> > > > > >>set caching IOCTL to set the corresponding buffer to be cached.
> > > > > >
> > > > > >No, the application doesn't. There are sufficent interfaces exposed that
> > > > > >userspace can bypass the kernel if it so desired.
> > > > > >
> > > > > >>The
> > > > > >>problem is that without setting the buffer to be cacheable the
> > > > > >>expectation is that we won't be snooping and incur the corresponding
> > > > > >>overhead. This is what this patch addresses.
> > > > > >
> > > > > >Not true.
> > > > > >
> > > > > >>The bit is also bogus, if we wanted snooping via MOCS we'd use the
> > > > > >>dedicated HW flag for that.
> > > > > >
> > > > > >But you keep saying this bit *enables* snooping. So either it does or it
> > > > > >doesn't.
> > > > > >
> > > > > >>If we wanted to have a snooping MOCS entry we should add that
> > > > > >>separately (as a forth entry), but we'd need this change as a fix for
> > > > > >>current users.
> > > > > >
> > > > > >The current users who are getting what they request but don't know what
> > > > > >they were requesting?
> > > > > 
> > > > > What this kernel ABI (index entry #2) has been agreed & documented to
> > > > > provide?
> > > > > 
> > > > > I thought this entry is supposed to replace the writeback LLC/eLLC cache
> > > > > MOCS setting Mesa is using on (e.g. BDW) to speed up accesses to a memory
> > > > > area which it knows always to be accessed so that it can be cached.
> > > > > 
> > > > > If app runs on HW where LLC/eLLC is missing, giving the app extra slowdown
> > > > > instead of potential speedup sounds like failed HW abstraction. :-)
> > > > 
> > > > Well mesa needs to know llc vs. !llc anyway to not totally suck, and
> > > > defining entry #2 as "coherent, always" makes sense. I thought entry 0 was
> > > > the reaonable default aka pte passthrough and hence managed by kernel?
> > > 
> > > Nope, we fscked that up somewhat, and entry 1 is the PTE one :( So if
> > > userspace forgets to set MOCS on gen9 it won't get the behaviour it
> > > would have gotten on previous gens.
> > 
> > How do we manage to pass the various set_caching vs. rendercpy test then
> > on skl? Or do we just not, and everyone is still lalala?
> 
> Why wouldn't it pass? It's all coherent anyway. Although apparently the
> docs have some confusing comments that things wouldn't be coherent unless
> you enable LLC caching, but maybe that's just about coherent L3 or
> something?

We have at least one that uses cache control by the kernel and makes sure
(using crcs) that the stuff actually lands. If we accidentally do coherent
access instead of wc cache dirt might end up in cpu caches. At least it
seems pretty fishy.

> > Sounds like something to fix either way.
> > 
> > > > If mesa asks for nonsense, the kernel is happy to oblige.
> > > 
> > > We never really defined what entry 2 actually means: coherent or sane
> > > performance. Mesa has, rightfully IMO, made the assumption that it means
> > > the latter since we never set out to define any MOCS entries with
> > > coherency in mind. And seeing that it's already out in the wild, I think
> > > it's better to respect it. If we change it now we would just make it
> > > more painful for people when they get their hands on the hardware.
> > > 
> > > I think what we should do is define what the MOCS indexes mean in some
> > > uapi header. Then there would be no ambiguity.
> > 
> > Problem with encoding this is that sooner or later we're playing a game of
> > whack-a-mole where kernel second-guesses mesa to support old versions, and
> > mesa tries to work around kernel assumptions that no longer fit reworked
> > code. "It's the easiest solution for existing binaries" is imo very poor
> > justification for ABI, especially when it's just about performance.
> > Fundamentally mesa still works correctly, just a bit slow.
> > 
> > I don't want to merge a random hack just because.
> 
> I don't see it as any kind of hack. That's the ABI we agreed on pretty
> much in my opinion. No one had any concerns about coherency when the
> current MOCS entries were allocated IIRC.

Hm, not being concerned about coherency when discussing MOCS sounds like a
pretty big review oversight.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux