-----Original Message-----
From: Lespiau, Damien
Sent: Thursday, June 18, 2015 2:51 PM
To: Antoine, Peter
Cc: intel-gfx@xxxxxxxxxxxxxxxxxxxxx; daniel.vetter.intel.com@xxxxxxxxxxxxxxxxxxxxxxxxxxxx; chris@xxxxxxxxxxxxxxxxxx; mattst88@xxxxxxxxx
Subject: Re: [PATCH v5] drm/i915 : Added Programming of the MOCS

On Thu, Jun 18, 2015 at 01:29:45PM +0100, Peter Antoine wrote:
> @@ -1379,6 +1380,13 @@ static int gen8_init_rcs_context(struct intel_engine_cs *ring,
>  	if (ret)
>  		return ret;
>
> +	/*
> +	 * Failing to program the MOCS is non-fatal.The system will not
> +	 * run at peak performance. So generate a warning and carry on.
> +	 */
> +	if (intel_rcs_context_init_mocs(ring, ctx) != 0)
> +		DRM_ERROR("MOCS failed to program: expect performance issues.");
> +

Missing a '\n'.

Will fix.

> +static const struct drm_i915_mocs_entry skylake_mocs_table[] = {
> +	/* {0x00000009, 0x0010} */
> +	{(MOCS_CACHEABILITY(EDRAM_UC) | MOCS_TGT_CACHE(LLC_ELLC) |
> +	  MOCS_LRUM(0) | MOCS_AOM(0) | MOCS_LECC_ESC(0) | MOCS_SCC(0) |
> +	  MOC_PFM(0) | MOCS_SCF(0)),
> +	 (MOCS_ESC(0) | MOCS_SCC(0) | MOCS_L3_CACHEABILITY(L3_UC))},
> +	/* {0x0000003b, 0x0030} */

We're still missing the usage hints for those configuration entries. That'd help user space a lot, which means this patch would land more quickly as well.

These are boiled down from 250+ requirements from different use cases (OpenCL, media, etc.); I can't really generate any more usage hints.

> +int intel_rcs_context_init_mocs(struct intel_engine_cs *ring,
> +				struct intel_context *ctx)
> +{
> +	int ret = 0;
> +
> +	struct drm_i915_mocs_table t;
> +	struct drm_device *dev = ring->dev;
> +	struct intel_ringbuffer *ringbuf = ctx->engine[ring->id].ringbuf;
> +
> +	if (get_mocs_settings(dev, &t)) {
> +		u32 table_size;
> +
> +		/*
> +		 * OK. For each supported ring:
> +		 * number of mocs entries * 2 dwords for each control_value
> +		 * plus number of mocs entries /2 dwords for l3cc values.
> +		 *
> +		 * Plus 1 for the load command and 1 for the NOOP per ring
> +		 * and the l3cc programming.
> +		 */
> +		table_size = GEN9_NUM_MOCS_RINGS *
> +			((2 * GEN9_NUM_MOCS_ENTRIES) + 2) +
> +			GEN9_NUM_MOCS_ENTRIES + 2;
> +		ret = intel_logical_ring_begin(ringbuf, ctx, table_size);
> +		if (ret) {
> +			DRM_DEBUG("intel_logical_ring_begin failed %d\n", ret);
> +			return ret;
> +		}
> +
> +		/* program the control registers */
> +		emit_mocs_control_table(ringbuf, &t, GEN9_GFX_MOCS_0);
> +		emit_mocs_control_table(ringbuf, &t, GEN9_MFX0_MOCS_0);
> +		emit_mocs_control_table(ringbuf, &t, GEN9_MFX1_MOCS_0);
> +		emit_mocs_control_table(ringbuf, &t, GEN9_VEBOX_MOCS_0);
> +		emit_mocs_control_table(ringbuf, &t, GEN9_BLT_MOCS_0);

So, if I'm not mistaken, I think this only works because we fully initialize the default context at start/reset time through:

  + i915_gem_init_hw()
    + i915_gem_context_enable()
      + cycle through all the rings and call ring->init_context()
        + gen8_init_rcs_context()
          + intel_rcs_context_init_mocs() (initialize ALL the MOCS!)

Yes.

So, initializing the other (non-render) MOCS in gen8_init_rcs_context() isn't the most logical thing to do, I'm afraid. What happens if we suddenly decide that we don't want to fully initialize the default context at startup, but to initialize each ring on demand for that context as well? We can end up in a situation where we use the blitter first and we wouldn't have the blitter MOCS initialized.

In that sense, that code makes an assumption about how we do things in a completely different part of the driver, and that's always a potential source of bugs.

Yes, but this is the same with the golden context and the workarounds (as I understand it), so all this code would have to be moved.

Chris, how far am I ?
:p

One way to "solve" this (if that's indeed the issue pointed at by Chris) would be to decouple the render MOCS from the others: keep the render ones in there, as they need to be emitted from the ring, but put the other writes (which could be done through MMIO as well) higher in the chain. That could probably make sense in i915_gem_context_enable() (which, by the way, is awfully named; it should have an _init somewhere). It could also be a per-ring vfunc, I suppose.

For similar reasons, I think the GuC MOCS should be part of the GuC init as well, so we don't couple different parts of the code too tightly.

Now, is that really a blocker? I'd say no if we had userspace ready and could commit that today, because we really want it. Still something to look at; I could be totally wrong.

Not a blocker. It gets a little more interesting, as the L3CC registers are shared across all engines but are only saved in the RCS context. But they are reset on the context switch when ELSP is set, so we would have to program them (i.e. via MMIO) and also set them in the batch start for the RCS. Each ring would have to have a proper init_context() and these registers programmed there.

The separate header for a single function isn't something we usually do either, but that can always be folded in later.

Yep, I agree that is overkill.

-- 
Damien
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
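
A side note on the dword budget in the quoted hunk: the comment speaks of "entries /2 dwords" for the l3cc values, while the table_size expression reserves a full GEN9_NUM_MOCS_ENTRIES for them. The two can be reconciled if every register write costs two dwords (offset plus value), which is an assumption here, since the emit helpers are not shown in the quoted text. Below is a minimal sketch of that arithmetic; the helper name mocs_table_dwords is hypothetical and is not part of the patch.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of the patch): recomputes the dword budget
 * that the quoted table_size expression reserves in the ring.
 */
static unsigned int mocs_table_dwords(unsigned int num_rings,
				      unsigned int num_entries)
{
	/* Assumption: two l3cc entries share one register, so the count is even. */
	assert(num_entries % 2 == 0);

	/* Per ring: an (offset, value) pair per control entry, plus load command and NOOP. */
	unsigned int per_ring = 2 * num_entries + 2;

	/* l3cc: num_entries / 2 registers at 2 dwords each, plus load command and NOOP. */
	unsigned int l3cc = 2 * (num_entries / 2) + 2;

	return num_rings * per_ring + l3cc;
}
```

With 5 rings (matching the five emit_mocs_control_table() calls above) and an illustrative 62 entries, neither of which is stated in the quoted text, this gives 5 * 126 + 64 = 694 dwords.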