Zhigang Gong <zhigang.gong@xxxxxxxxxxxxxxx> writes: > On Mon, Jan 05, 2015 at 07:27:26PM +0200, Francisco Jerez wrote: >> Zhigang Gong <zhigang.gong@xxxxxxxxxxxxxxx> writes: >> >> > On Mon, Jan 05, 2015 at 05:03:16AM +0200, Francisco Jerez wrote: >> >> Zhigang Gong <zhigang.gong@xxxxxxxxx> writes: >> >> >> >> > According to bspec, ROW_CHICKEN3's bit 6 which is to disable >> >> > move of cacheable global atomics to L3 is needed for GT3 D >> >> > stepping. >> >> > >> >> > I enabled it and tested it with HSW GT2 D stepping and GT3 E stepping. >> >> > The atomics works fine in beignet. And it could get more than 10x performance >> >> > improvement with some workload, for an example, the "splat" kernel in darktable, >> >> > without this patch, it consumes 50 seconds in one large raw picture processing. >> >> > But with this patch, the same process only takes less than 1 second. >> >> > >> >> >> >> I tried this already (on HSW GT2 D as well) and I don't think it's >> >> enough to get L3 atomics working reliably. Even though they did seem to >> >> work OK at first glance I observed some corruption issues (e.g. atomic >> >> writes not landing in system memory) when doing atomic writes to >> >> contiguous (as in within the same cache-line) locations in memory. The >> >> "unused" ARB_shader_image_load_store test [1] I sent to the Piglit >> >> mailing list some time ago exposes this IIRC, and probably a couple of >> >> other tests too. >> > Ok, I will find that case and have a try on my systems. I just tested all >> > the atomic related OpenCL conformance test cases without any issues. > > I just found the patchset hasn't been accepted by piglit. I took a look at > the source code. And realize that that shader will not work correctly if > mesa doesn't allocate some DC in L3 space. Don't know whether you already > allocated some DC when you did that test. > Of course, and it still fails no matter how you allocate the L3. >> > >> >> >> >> Also this change is going to cause an instant lock-up anytime Mesa uses >> >> atomics because Mesa doesn't change the default L3 way allocation for >> >> the DC, which turns out to be 0 on HSW. >> > >> > This is another issue, IMHO, if the application wants to use atomics, >> > it's better to allocate some L3 space for DC. Otherwise, it could >> > never leverage the "atomics in L3 feature". Based on my test, >> > the performance impact as huge as more than 10x for some workloads. >> > >> Sure, but this change alone will cause a regression (an irrecoverable >> system hang) with current releases of Mesa. >> >> I agree that Mesa should be fixed eventually to assign L3 space to the >> DC for some workloads, but it doesn't seem like we have a satisfactory >> API to do that right now -- The current mechanism used by Beignet >> (poking the L3 control registers from the batch buffer) is slightly >> concerning from a security point of view because it allows an arbitrary >> userspace application to cause misrendering or impact the performance of > I agree with you that the L3 configuration is very dangerous. It's better > to do more checking in the kernel space and make sure the user space > application will not crash the system via L3 configuration. > And, if the kernel provide new API we will use it in beignet ASAP in beignet. > >> other clients (since the L3 config registers are not being saved and >> restored as part of the context) and even crash the whole system. It > I found the L3 config registers are part of the IVB context, but not part > of HSW's. So if two user applications want to use different L3 config, > the user application need to do the config registers' store/restore manually. > Is there a way to do this type of thing gracefully in KMD? The new API > may need to consider the safely context switching regards to L3 config change. > Yeah, relying on userspace saving and restoring the L3 config values sounds rather fragile to me, and may not work if e.g. a context hangs and gets kicked out. IMHO it would be preferrable to keep track of the L3 config and save/restore it from the kernel on context switch. >> also doesn't look like it could be made to work with the hardware >> command checker. > > As to the hardware command checker: > I just found with current nightly kernel, if we boot the kernel with > "i915.enable_ppgtt=2", then we can configure the L3 related registers > in user space. But the command checker still don't allow user space to > chagne L3 config by default. > Ugh, apparently that's because the hardware command checker is not enabled at all in that case... > Thanks, > Zhigang Gong. > >> >> > Thanks, >> > Zhigang Gong. >> > >> >> >> >> [1] http://lists.freedesktop.org/archives/piglit/2014-December/013571.html >> >> >> >> > Signed-off-by: Zhigang Gong <zhigang.gong@xxxxxxxxx> >> >> > --- >> >> > drivers/gpu/drm/i915/intel_pm.c | 10 ++++++---- >> >> > 1 file changed, 6 insertions(+), 4 deletions(-) >> >> > >> >> > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c >> >> > index 7d99a9c..8a27802 100644 >> >> > --- a/drivers/gpu/drm/i915/intel_pm.c >> >> > +++ b/drivers/gpu/drm/i915/intel_pm.c >> >> > @@ -5938,10 +5938,12 @@ static void haswell_init_clock_gating(struct drm_device *dev) >> >> > >> >> > ilk_init_lp_watermarks(dev); >> >> > >> >> > - /* L3 caching of data atomics doesn't work -- disable it. */ >> >> > - I915_WRITE(HSW_SCRATCH1, HSW_SCRATCH1_L3_DATA_ATOMICS_DISABLE); >> >> > - I915_WRITE(HSW_ROW_CHICKEN3, >> >> > - _MASKED_BIT_ENABLE(HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE)); >> >> > + if (IS_HSW_GT3(dev) && dev->pdev->revision <= 6) { >> >> > + /* L3 caching of data atomics doesn't work -- disable it. */ >> >> > + I915_WRITE(HSW_SCRATCH1, HSW_SCRATCH1_L3_DATA_ATOMICS_DISABLE); >> >> > + I915_WRITE(HSW_ROW_CHICKEN3, >> >> > + _MASKED_BIT_ENABLE(HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE)); >> >> > + } >> >> > >> >> > /* This is required by WaCatErrorRejectionIssue:hsw */ >> >> > I915_WRITE(GEN7_SQ_CHICKEN_MBCUNIT_CONFIG, >> >> > -- >> >> > 1.8.3.2 >> > >> > >> > >> > >> >> _______________________________________________ >> >> Intel-gfx mailing list >> >> Intel-gfx@xxxxxxxxxxxxxxxxxxxxx >> >> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Attachment:
pgpSQVtUG2e7z.pgp
Description: PGP signature
_______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx