>>>> >>>> static u64 mtl_pte_encode(dma_addr_t addr, >>>> - enum i915_cache_level level, >>>> + unsigned int pat_index, >>>> u32 flags) >>> Prototype and implementation changed here for mtl_pte_encode. >>> >>> And we have: >>> >>> if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 70)) >>> ppgtt->vm.pte_encode = mtl_pte_encode; >>> else >>> ppgtt->vm.pte_encode = gen8_pte_encode; >>> So should be same prototype. And: >>> >>> u64 (*pte_encode)(dma_addr_t addr, >>>- enum i915_cache_level level, >>>+ unsigned int pat_index, >>> u32 flags); >>>/* Create a valid PTE */ >>> Patch relies on the compiler considering enum equal to unsigned int? >> >> yes, caller is passing in unsigned int and gets used as enum. >> >>> But the implementation of gen8_pte_encode and most ggtt >>> counterparts is looking at the passed in pat index and thinks it is cache level. >>> >>> How is that supposed to work?! Or I am blind and am missing something? >> >> For legacy platforms translation through LEGACY_CACHELEVEL would not >> change the value of cache_level. The cache_level and pat_index are >> basically the same for these platforms. > > Oh that is nasty little trick! And I did not spot it being described > anywhere in the commit message or code comments. Will add. > So you are saying for legacy cache_level equals pat_index for what > caching behaviour is concerned. Ie: > > +#define LEGACY_CACHELEVEL \ > + .cachelevel_to_pat = { \ > + [I915_CACHE_NONE] = 0, \ > + [I915_CACHE_LLC] = 1, \ > + [I915_CACHE_L3_LLC] = 2, \ > + [I915_CACHE_WT] = 3, \ > + } > > And because: > > enum i915_cache_level { > I915_CACHE_NONE = 0, > I915_CACHE_LLC, > I915_CACHE_L3_LLC, > I915_CACHE_WT, > }; > > This indeed ends up a 1:1 reversible mapping. > > But it is hidden and fragile. What prevents someone from changing the > enum i915_cache_level? There is no explicit linkage with hardware PAT > values anywhere. Or at least I don't see it. > > I would say all pte_encode signatures have to be changed to pat index. > > Which means all pte encode implementations have to understand what pat indices mean. > > Which brings us back to that idea of a 2nd table, I paraphrase: > > .legacy_pat_to_cache = { > [0] = I915_PAT_UC, > [1] = I915_PAT_WB, > [2] = I915_PAT_WB | I915_PAT_LLC /* not sure on this one */ > [3] = I915_PAT_WT, > }; > > Pat_encode implementations then instead: > > switch (level) { > case I915_CACHE_NONE: > pte |= PPAT_UNCACHED; > ... > > Do: > > if (i915->pat_to_cache_flags[pat_index] & I915_PAT_UC) > pte |= PPAT_UNCACHED; > else if > ... > > > But it would require i915 to be passed in which is admittedly a > noisy diff. Hm.. benefit of hardware agnostic enum i915_cache_level.. That's was one of the problem I ran into when creating this patch. > Maybe convert pat_index to I915_PAT_.. flags in the callers? Like this: > > gen8_ppgtt_insert_pte(...) > ... > const u32 pat_flags = i915->pat_to_cache_flags[pat_index]; > const gen8_pte_t pte_encode = ppgtt->vm.pte_encode(0, pat_flags, flags); > > Etc. That would be smaller churn on the pte_encode signature. > > Maybe helper for i915->pat_to_cache_flags lookup so it can check the array bounds? > > If this all sounds too much to you maybe we can do it as a followup. It's getting more complicated... But I believe it should be doable to define a PAT table (same in Bspec) and drop the cache_level. Together with the PPAT bit masks (such as MTL_PPAT_L4_3_UC) defined in intel_gtt.h, the code can be simpler without translation. > Or perhaps it is actually pointing towards that obj->pat_index is not the most > elegant choice to be used as a single point of truth.. perhaps obj->cache_flags > would be better. It would be set at same entry points and it would be hw agnostic > so could end up more elegant in the driver. > > But then I think we need at minimum something like the below in this patch, somewhere: > > /* > * On pre-Gen12 platforms enum i915_cache_level happens to align > * with caching modes as specified in hardware PAT indices. Our > * implementation relies on that due tricks played (explain the > * tricks) in the pte_encode vfuncs. > * Ensure this trick keeps working until the driver can be fully > * refactored to support pat indices better. > */ > BUILD_BUG_ON(I915_CACHE_NONE != 0); > ... etc for all enums ... Will add > if (gen < 12) { > GEM_WARN_ON(i915_gem_get_pat_index(i915, I915_CACHE_NONE) != 0); > ... etc for all enums ... This should not be necessary as long as the enum is verified. > } > >> It is broken for gen12 here. I was asked to separate the >> gen12_pte_encode change to another patch in the series, but that >> breaks bisect. Should I squash 2/5 and 3/5? > > This patch breaks gen12? Yes that should definitely be avoided. Will fix that by squashing 2/5 and 3/5 with explanation in commit message. > Regards, > > Tvrtko