On 2018-07-18 16:42, Tvrtko Ursulin wrote:
On 18/07/2018 14:24, Joonas Lahtinen wrote:
Quoting Tomasz Lis (2018-07-16 16:07:16)
+static int emit_set_data_port_coherency(struct i915_request *rq,
bool enable)
+{
+ u32 *cs;
+ i915_reg_t reg;
+
+ GEM_BUG_ON(rq->engine->class != RENDER_CLASS);
+ GEM_BUG_ON(INTEL_GEN(rq->i915) < 9);
+
+ cs = intel_ring_begin(rq, 4);
+ if (IS_ERR(cs))
+ return PTR_ERR(cs);
+
+ if (INTEL_GEN(rq->i915) >= 11)
+ reg = ICL_HDC_MODE;
+ else if (INTEL_GEN(rq->i915) >= 10)
+ reg = CNL_HDC_CHICKEN0;
+ else
+ reg = HDC_CHICKEN0;
+
+ *cs++ = MI_LOAD_REGISTER_IMM(1);
+ *cs++ = i915_mmio_reg_offset(reg);
+ /* Enabling coherency means disabling the bit which forces
it off */
This comment is still spurious, please get rid of the habit of writing
comments about "what" the code is doing, useful comments should be
limited to "why", which is quite self explanatory here, that's the way
the register is.
Ok, I will read the related doc:
https://www.kernel.org/doc/html/v4.10/process/coding-style.html#commenting
+static int
+intel_lr_context_update_data_port_coherency(struct i915_request *rq)
+{
+ struct i915_gem_context *ctx = rq->gem_context;
+ bool enable = test_bit(CONTEXT_DATA_PORT_COHERENT_REQUESTED,
&ctx->flags);
+ int ret;
+
+ lockdep_assert_held(&rq->i915->drm.struct_mutex);
+
+ if (test_bit(CONTEXT_DATA_PORT_COHERENT_ACTIVE, &ctx->flags)
== enable)
+ return 0;
+
+ ret = emit_set_data_port_coherency(rq, enable);
+
+ if (!ret) {
+ if (enable)
+ __set_bit(CONTEXT_DATA_PORT_COHERENT_ACTIVE, &ctx->flags);
+ else
+ __clear_bit(CONTEXT_DATA_PORT_COHERENT_ACTIVE, &ctx->flags);
+ }
Do we have indication that the hardware feature will be unreliable in
responding to the requests? I don't think you need the differentiation
of requested vs. active. If there is an error, we can just report
back to
the user as a failed IOCTL. Now it adds unnecessary complication for
no benefit.
Requested vs active is for implementing the lazy emit.
AFAIR it does propagate the error out of execbuf (although we never
ever expect it to happen), and this is just to keep the internal
house-keeping in sync.
Regards,
Tvrtko
@@ -2164,6 +2221,13 @@ static int gen8_emit_flush_render(struct
i915_request *request,
/* WaForGAMHang:kbl */
if (IS_KBL_REVID(request->i915, 0, KBL_REVID_B0))
dc_flush_wa = true;
+
+ /* Emit the switch of data port coherency state if
needed */
Ditto for spurious comment, just about what the code does.
+++ b/include/uapi/drm/i915_drm.h
@@ -1456,6 +1456,13 @@ struct drm_i915_gem_context_param {
#define I915_CONTEXT_MAX_USER_PRIORITY 1023 /* inclusive */
#define I915_CONTEXT_DEFAULT_PRIORITY 0
#define I915_CONTEXT_MIN_USER_PRIORITY -1023 /* inclusive */
+/*
+ * When data port level coherency is enabled, the GPU will update
memory
+ * buffers shared with CPU, by forcing internal cache units to send
memory
+ * writes to higher level caches faster. Enabling data port
coherency has
+ * a performance cost.
+ */
I was under impression this is enabled by default and it can be disabled
for a performance optimization?
This is true, coherency is kept by default. We disable it as a
workaround: performance-related for gen11, and due to minor hardware
issue on previous platforms. See WaForceEnableNonCoherent.
-Tomasz
Regards, Joonas
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx