Zhigang Gong <zhigang.gong@xxxxxxxxx> writes:

> According to bspec, ROW_CHICKEN3's bit 6 which is to disable
> move of cacheable global atomics to L3 is needed for GT3 D
> stepping.
>
> I enabled it and tested it with HSW GT2 D stepping and GT3 E stepping.
> The atomics works fine in beignet. And it could get more than 10x performance
> improvement with some workload, for an example, the "splat" kernel in darktable,
> without this patch, it consumes 50 seconds in one large raw picture processing.
> But with this patch, the same process only takes less than 1 second.
>

I tried this already (on HSW GT2 D as well) and I don't think it's enough
to get L3 atomics working reliably. Even though they did seem to work OK
at first glance I observed some corruption issues (e.g. atomic writes not
landing in system memory) when doing atomic writes to contiguous (as in
within the same cache-line) locations in memory. The "unused"
ARB_shader_image_load_store test [1] I sent to the Piglit mailing list
some time ago exposes this IIRC, and probably a couple of other tests too.

Also this change is going to cause an instant lock-up anytime Mesa uses
atomics because Mesa doesn't change the default L3 way allocation for the
DC, which turns out to be 0 on HSW.

[1] http://lists.freedesktop.org/archives/piglit/2014-December/013571.html

> Signed-off-by: Zhigang Gong <zhigang.gong@xxxxxxxxx>
> ---
>  drivers/gpu/drm/i915/intel_pm.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 7d99a9c..8a27802 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -5938,10 +5938,12 @@ static void haswell_init_clock_gating(struct drm_device *dev)
>  
>  	ilk_init_lp_watermarks(dev);
>  
> -	/* L3 caching of data atomics doesn't work -- disable it. */
> -	I915_WRITE(HSW_SCRATCH1, HSW_SCRATCH1_L3_DATA_ATOMICS_DISABLE);
> -	I915_WRITE(HSW_ROW_CHICKEN3,
> -		   _MASKED_BIT_ENABLE(HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE));
> +	if (IS_HSW_GT3(dev) && dev->pdev->revision <= 6) {
> +		/* L3 caching of data atomics doesn't work -- disable it. */
> +		I915_WRITE(HSW_SCRATCH1, HSW_SCRATCH1_L3_DATA_ATOMICS_DISABLE);
> +		I915_WRITE(HSW_ROW_CHICKEN3,
> +			   _MASKED_BIT_ENABLE(HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE));
> +	}
>  
>  	/* This is required by WaCatErrorRejectionIssue:hsw */
>  	I915_WRITE(GEN7_SQ_CHICKEN_MBCUNIT_CONFIG,
> -- 
> 1.8.3.2
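For anyone following along who is not familiar with the masked-register convention used in the quoted hunk, here is a minimal standalone sketch of what a `_MASKED_BIT_ENABLE()` write does. It is not kernel code: the macros below only mirror the shape of the i915 helpers as I understand them (upper 16 bits act as a write mask for the lower 16), `masked_write()` is a hypothetical software model of such a register, and the bit position is taken from the patch description ("ROW_CHICKEN3's bit 6").

```c
#include <stdint.h>

/* Assumed to match the shape of the i915 helpers: the upper 16 bits select
 * which of the lower 16 bits a write is allowed to change. */
#define MASKED_BIT_ENABLE(a)  ((uint32_t)(((a) << 16) | (a)))
#define MASKED_BIT_DISABLE(a) ((uint32_t)((a) << 16))

/* Bit 6 of ROW_CHICKEN3, per the patch description. */
#define L3_GLOBAL_ATOMICS_DISABLE (1u << 6)

/* Hypothetical model of a masked register, for illustration only:
 * bits not selected by the mask keep their previous value. */
static uint16_t masked_write(uint16_t reg, uint32_t val)
{
	uint16_t mask = (uint16_t)(val >> 16);
	uint16_t bits = (uint16_t)(val & 0xffff);

	return (uint16_t)((reg & ~mask) | (bits & mask));
}
```

With arbitrary starting contents of `0x0101`, `masked_write(0x0101, MASKED_BIT_ENABLE(L3_GLOBAL_ATOMICS_DISABLE))` gives `0x0141`, and a following `MASKED_BIT_DISABLE` write brings it back to `0x0101`. In other words, the write in the hunk only touches bit 6 and leaves the rest of the register alone.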
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx