On Monday, April 3, 2023 9:48:40 AM PDT Ville Syrjälä wrote:
> On Mon, Apr 03, 2023 at 09:35:32AM -0700, Matt Roper wrote:
> > On Mon, Apr 03, 2023 at 07:02:08PM +0300, Ville Syrjälä wrote:
> > > On Fri, Mar 31, 2023 at 11:38:30PM -0700, fei.yang@xxxxxxxxx wrote:
> > > > From: Fei Yang <fei.yang@xxxxxxxxx>
> > > >
> > > > To comply with the design that buffer objects shall have immutable
> > > > cache setting through out its life cycle, {set, get}_caching ioctl's
> > > > are no longer supported from MTL onward. With that change caching
> > > > policy can only be set at object creation time. The current code
> > > > applies a default (platform dependent) cache setting for all objects.
> > > > However this is not optimal for performance tuning. The patch extends
> > > > the existing gem_create uAPI to let user set PAT index for the object
> > > > at creation time.
> > >
> > > This is missing the whole justification for the new uapi.
> > > Why is MOCS not sufficient?
> >
> > PAT and MOCS are somewhat related, but they're not the same thing. The
> > general direction of the hardware architecture recently has been to
> > slowly dumb down MOCS and move more of the important memory/cache
> > control over to the PAT instead. On current platforms there is some
> > overlap (and MOCS has an "ignore PAT" setting that makes the MOCS "win"
> > for the specific fields that both can control), but MOCS doesn't have a
> > way to express things like snoop/coherency mode (on MTL), or class of
> > service (on PVC). And if you check some of the future platforms, the
> > hardware design starts packing even more stuff into the PAT (not just
> > cache behavior) which will never be handled by MOCS.
>
> Sigh. So the hardware designers screwed up MOCS yet again and
> instead of getting that fixed we are adding a new uapi to work
> around it?
>
> The IMO sane approach (which IIRC was the situation for a few
> platform generations at least) is that you just shove the PAT
> index into MOCS (or tell it to go look it up from the PTE).
> Why the heck did they not just stick with that?

There are actually some use cases in newer APIs where MOCS doesn't
work well. For example, VK_KHR_buffer_device_address in Vulkan 1.2:

https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/VK_KHR_buffer_device_address.html

It essentially adds "pointers to buffer memory in shaders", where apps
can just get a 64-bit pointer, and use it with an address. On our EUs,
that turns into A64 data port messages which refer directly to memory.
Notably, there's no descriptor (i.e. SURFACE_STATE) where you could
stuff a MOCS value. So, you get one single MOCS entry for all such
buffers...which is specified in STATE_BASE_ADDRESS. Hope you wanted
all of them to have the same cache & coherency settings!

With PAT/PTE, we can at least specify settings for each buffer, rather
than one global setting.

Compression has also been moving towards virtual address-based
solutions and handling in the caches and memory controller, rather
than in e.g. the sampler reading SURFACE_STATE. (It started evolving
that way with Tigerlake, really, but continues.)

> > Also keep in mind that MOCS generally applies at the GPU instruction
> > level; although a lot of instructions have a field to provide a MOCS
> > index, or can use a MOCS already associated with a surface state, there
> > are still some that don't. PAT is the source of memory access
> > characteristics for anything that can't provide a MOCS directly.
>
> So what are the things that don't have MOCS and where we need
> some custom cache behaviour, and we already know all that at
> buffer creation time?

For Meteorlake...we have MOCS for cache settings. We only need to use
PAT for coherency settings; I believe we can get away with deciding
that up-front at buffer creation time. If we were doing full
cacheability, I'd be very nervous about deciding performance tuning
at creation time.

--Ken
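
The difference between one global MOCS entry and per-buffer PAT
settings can be sketched with a toy model. This is only an
illustration under stated assumptions: the table contents, the class
and function names, and the addresses below are all made up for the
example, and none of it is i915, Mesa, or hardware API. It only shows
why an access with no descriptor ends up sharing one setting, while a
lookup keyed on the buffer's mapping does not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemAttrs:
    """Hypothetical memory attributes; real encodings are platform specific."""
    cached: bool
    coherent: bool


@dataclass(frozen=True)
class Buffer:
    """Hypothetical buffer object with a PAT index fixed at creation time."""
    name: str
    base: int
    size: int
    pat_index: int


# Made-up tables for illustration only.
PAT_TABLE = {
    0: MemAttrs(cached=True, coherent=False),
    1: MemAttrs(cached=True, coherent=True),
}
MOCS_TABLE = {
    0: MemAttrs(cached=True, coherent=False),
}


def find_buffer(buffers, addr):
    """Return the buffer whose address range contains addr."""
    for buf in buffers:
        if buf.base <= addr < buf.base + buf.size:
            return buf
    raise LookupError(f"no buffer maps address {addr:#x}")


def attrs_from_global_mocs(global_mocs_index):
    """Raw-pointer access with no descriptor: every access, whatever
    buffer it lands in, gets the single globally programmed entry."""
    return MOCS_TABLE[global_mocs_index]


def attrs_from_pat(buffers, addr):
    """Per-buffer lookup: attributes follow the mapping of the buffer
    that the address falls into."""
    return PAT_TABLE[find_buffer(buffers, addr).pat_index]


buffers = [
    Buffer("scratch", base=0x10000, size=0x1000, pat_index=0),
    Buffer("shared", base=0x20000, size=0x1000, pat_index=1),
]

for addr in (0x10010, 0x20010):
    print(hex(addr), attrs_from_global_mocs(0), attrs_from_pat(buffers, addr))
```

In this sketch both addresses report the same attributes through the
global entry, while the per-buffer lookup reports a coherent mapping
only for the "shared" buffer.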