Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes:

> On Wed, Jan 11, 2017 at 02:07:37PM +0200, Mika Kuoppala wrote:
>> Daniel Vetter <daniel@xxxxxxxx> writes:
>>
>> > On Mon, Jan 09, 2017 at 01:07:56PM -0800, Francisco Jerez wrote:
>> >> The WaDisableLSQCROPERFforOCL workaround has the side effect of
>> >> disabling an L3SQ optimization that has huge performance implications
>> >> and is unlikely to be necessary for the correct functioning of usual
>> >> graphic workloads. Userspace is free to re-enable the workaround on
>> >> demand, and is generally in a better position to determine whether the
>> >> workaround is necessary than the DRM is (e.g. only during the
>> >> execution of compute kernels that rely on both L3 fences and HDC R/W
>> >> requests).
>> >>
>> >> The same workaround seems to apply to BDW (at least to production
>> >> stepping G1) and SKL as well (the internal workaround database claims
>> >> that it does for all steppings, while the BSpec workaround table only
>> >> mentions pre-production steppings), but the DRM doesn't do anything
>> >> beyond whitelisting the L3SQCREG4 register so userspace can enable it
>> >> when it sees fit. Do the same on KBL platforms.
>> >>
>> >> Improves performance of the GFXBench4 gl_manhattan31 benchmark by 60%,
>> >> and gl_4 (AKA car chase) by 14% on a KBL GT2 running Mesa master --
>> >> This is followed by a regression of 35% and 10% respectively for the
>> >> same benchmarks and platform caused by my recent patch series
>> >> switching userspace to use the dataport constant cache instead of the
>> >> sampler to implement uniform pull constant loads, which caused us to
>> >> hit more heavily the L3 cache (and on platforms other than KBL had the
>> >> opposite effect of improving performance of the same two benchmarks).
>> >> The overall effect on KBL of this change combined with the recent
>> >> userspace change is respectively 4.6% and 2.6%. SynMark2 OglShMapPcf
>> >> was affected by the constant cache changes (though it improved as it
>> >> did on other platforms rather than regressing), but is not
>> >> significantly affected by this patch (with statistical significance of
>> >> 5% and sample size 20).
>> >>
>> >> v2: Drop some more code to avoid unused variable warning.
>> >>
>> >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99256
>> >> Signed-off-by: Francisco Jerez <currojerez@xxxxxxxxxx>
>> >> Cc: Eero Tamminen <eero.t.tamminen@xxxxxxxxx>
>> >> Cc: Jani Nikula <jani.nikula@xxxxxxxxx>
>> >> Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxx>
>> >> Cc: beignet@xxxxxxxxxxxxxxxxxxxxx
>> >
>> > Don't we need some userspace flag/opt-in scheme to avoid stuff going boom
>> > for compute kernels? Are the patches for mesa compute/beignet
>> > ready&reviewed?
>>
>> This is explicit setting on kbl/E0 only. So one could argue
>> that unless they filter based on PCI-IDs, things would already
>> blow up across the skl/kbl population, if they forgot
>> to set it. The whitelisting is in place and looks sane
>> so this E0 exception is a wart that got in by me reading wa
>> database slavishly without thinking.
>
> Add Fixes then?

Fixes: a4106a782d11 ("drm/i915/gen9: Add WaFlushCoherentL3CacheLinesAtContextSwitch workaround")

Looking at beignet source, they don't care about this register/bit
(yet).

Also we need to get rid of KBL_REVID_E0 as there is no such thing.
Oddly, kbl doesn't follow the logical x0->rev mapping but leaves holes.
Were they afraid of running out of revids or what...

-Mika

> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx