Setting a write-back cache policy in the MOCS entry definition also implies snooping, which has a considerable overhead. This is unexpected for a few reasons: - From user-space's point of view since it didn't want a coherent surface (it didn't set the buffer as such via the set caching IOCTL). - There is a separate MOCS entry field for snooping (which we never set). - This MOCS table is about caching in (e)LLC and there is no (e)LLC on BXT. There is a separate table for L3 cache control. Considering the above the current behavior of snooping looks like an unintentional side-effect of the WB setting. Changing it to be PTE based cacheability gets rid of the snooping without any ill-effects. For a coherent surface the application would use a separate MOCS entry (at index 1) and call the set caching IOCTL to setup the PTE entries for the corresponding buffer to be snooped. This resulted in 70% improvement in synthetic texturing benchmarks. Kudos to Valtteri Rantala, Eero Tamminen and Michael T Frederick and Ville who helped to narrow the source of problem to the kernel and to the snooping behaviour in particular. With a follow-up change to adjust the 3rd entry value igt/gem_mocs_settings is passing after this change. v2: - Rebase on v2 of patch 1/2. CC: Valtteri Rantala <valtteri.rantala@xxxxxxxxx> CC: Eero Tamminen <eero.t.tamminen@xxxxxxxxx> CC: Michael T Frederick <michael.t.frederick@xxxxxxxxx> CC: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> CC: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> Signed-off-by: Imre Deak <imre.deak@xxxxxxxxx> --- drivers/gpu/drm/i915/intel_mocs.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_mocs.c b/drivers/gpu/drm/i915/intel_mocs.c index f391ad6..d38f65e 100644 --- a/drivers/gpu/drm/i915/intel_mocs.c +++ b/drivers/gpu/drm/i915/intel_mocs.c @@ -149,8 +149,8 @@ static const struct drm_i915_mocs_entry broxton_mocs_table[] = { .l3cc_value = L3_ESC(0) | L3_SCC(0) | L3_CACHEABILITY(L3_WB), }, { - /* 0x0000003b */ - .control_value = LE_CACHEABILITY(LE_WB) | + /* 0x00000038 */ + .control_value = LE_CACHEABILITY(LE_PAGETABLE) | LE_TGT_CACHE(LE_TC_LLC_ELLC) | LE_LRUM(3) | LE_AOM(0) | LE_RSC(0) | LE_SCC(0) | LE_PFM(0) | LE_SCF(0), -- 2.5.0 _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx