On 18/07/2018 14:24, Joonas Lahtinen wrote:
Quoting Tomasz Lis (2018-07-16 16:07:16)
+static int emit_set_data_port_coherency(struct i915_request *rq, bool enable)
+{
+	u32 *cs;
+	i915_reg_t reg;
+
+	GEM_BUG_ON(rq->engine->class != RENDER_CLASS);
+	GEM_BUG_ON(INTEL_GEN(rq->i915) < 9);
+
+	cs = intel_ring_begin(rq, 4);
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	if (INTEL_GEN(rq->i915) >= 11)
+		reg = ICL_HDC_MODE;
+	else if (INTEL_GEN(rq->i915) >= 10)
+		reg = CNL_HDC_CHICKEN0;
+	else
+		reg = HDC_CHICKEN0;
+
+	*cs++ = MI_LOAD_REGISTER_IMM(1);
+	*cs++ = i915_mmio_reg_offset(reg);
+	/* Enabling coherency means disabling the bit which forces it off */
This comment is still spurious. Please get rid of the habit of writing
comments about "what" the code is doing; useful comments should be
limited to "why", which is quite self-explanatory here: that's the way
the register is.
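For readers following the thread, the inverted sense under discussion can be sketched in isolation. This is only an illustration under stated assumptions: it assumes the register follows the masked-write convention (upper 16 bits select which of the lower 16 bits are written), and `SKETCH_FORCE_NON_COHERENT` and the `sketch_*` helpers are hypothetical names, not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position for the "force non-coherent" control. */
#define SKETCH_FORCE_NON_COHERENT (1u << 4)

/* Assumed masked-write encoding: the high half selects the bit, the low half sets it. */
static uint32_t sketch_masked_bit_enable(uint32_t bit)
{
	assert(bit <= 0xffffu);
	return (bit << 16) | bit;
}

/* Same encoding, but the selected bit is written as zero. */
static uint32_t sketch_masked_bit_disable(uint32_t bit)
{
	assert(bit <= 0xffffu);
	return bit << 16;
}

/* Coherency on means clearing the force-off bit; coherency off means setting it. */
static uint32_t sketch_coherency_value(bool enable)
{
	return enable ? sketch_masked_bit_disable(SKETCH_FORCE_NON_COHERENT)
		      : sketch_masked_bit_enable(SKETCH_FORCE_NON_COHERENT);
}
```

With the assumed bit position, `sketch_coherency_value(true)` yields 0x00100000 and `sketch_coherency_value(false)` yields 0x00100010.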
+static int
+intel_lr_context_update_data_port_coherency(struct i915_request *rq)
+{
+	struct i915_gem_context *ctx = rq->gem_context;
+	bool enable = test_bit(CONTEXT_DATA_PORT_COHERENT_REQUESTED, &ctx->flags);
+	int ret;
+
+	lockdep_assert_held(&rq->i915->drm.struct_mutex);
+
+	if (test_bit(CONTEXT_DATA_PORT_COHERENT_ACTIVE, &ctx->flags) == enable)
+		return 0;
+
+	ret = emit_set_data_port_coherency(rq, enable);
+
+	if (!ret) {
+		if (enable)
+			__set_bit(CONTEXT_DATA_PORT_COHERENT_ACTIVE, &ctx->flags);
+		else
+			__clear_bit(CONTEXT_DATA_PORT_COHERENT_ACTIVE, &ctx->flags);
+	}
Do we have any indication that the hardware feature will be unreliable in
responding to the requests? I don't think you need the differentiation
of requested vs. active. If there is an error, we can just report it back
to the user as a failed IOCTL. As it stands, this adds unnecessary
complication for no benefit.
Requested vs. active is for implementing the lazy emit.
AFAIR it does propagate the error out of execbuf (although we never
expect that to happen), and this is just to keep the internal
house-keeping in sync.
Regards,
Tvrtko
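The requested-vs-active bookkeeping described above can be sketched outside the driver roughly as follows. This is a minimal model, not the patch itself: `struct sketch_ctx`, `sketch_emit()` and the `inject_err` parameter are hypothetical stand-ins for the context flags, `emit_set_data_port_coherency()` and a ring-allocation failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the context; not the real i915_gem_context. */
struct sketch_ctx {
	bool coherent_requested; /* what userspace last asked for */
	bool coherent_active;    /* what was last successfully emitted */
	int emits;               /* number of register writes emitted */
};

/* Hypothetical emitter: returns 0 on success or a negative error code. */
static int sketch_emit(struct sketch_ctx *ctx, bool enable, int inject_err)
{
	(void)enable;
	if (inject_err)
		return inject_err;
	ctx->emits++;
	return 0;
}

/* Lazy emit: write the register only when the requested state differs. */
static int sketch_update_coherency(struct sketch_ctx *ctx, int inject_err)
{
	int ret;

	assert(ctx != NULL);

	if (ctx->coherent_active == ctx->coherent_requested)
		return 0;

	ret = sketch_emit(ctx, ctx->coherent_requested, inject_err);
	if (ret == 0)
		ctx->coherent_active = ctx->coherent_requested;

	/* On failure the active state is left untouched and the error propagates. */
	return ret;
}
```

In this model a failed emit leaves `coherent_active` unchanged, so the next submission retries the switch, while repeated submissions with an unchanged request emit nothing.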
@@ -2164,6 +2221,13 @@ static int gen8_emit_flush_render(struct i915_request *request,
/* WaForGAMHang:kbl */
if (IS_KBL_REVID(request->i915, 0, KBL_REVID_B0))
dc_flush_wa = true;
+
+ /* Emit the switch of data port coherency state if needed */
Ditto for the spurious comment; it only says what the code does.
+++ b/include/uapi/drm/i915_drm.h
@@ -1456,6 +1456,13 @@ struct drm_i915_gem_context_param {
#define I915_CONTEXT_MAX_USER_PRIORITY 1023 /* inclusive */
#define I915_CONTEXT_DEFAULT_PRIORITY 0
#define I915_CONTEXT_MIN_USER_PRIORITY -1023 /* inclusive */
+/*
+ * When data port level coherency is enabled, the GPU will update memory
+ * buffers shared with CPU, by forcing internal cache units to send memory
+ * writes to higher level caches faster. Enabling data port coherency has
+ * a performance cost.
+ */
I was under the impression that this is enabled by default and can be
disabled as a performance optimization?
Regards, Joonas
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx