Re: [PATCH 07/10] drm/i915: Gate engine stats collection with a static key

Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> · Wed, 4 Oct 2017 18:38:09 +0100

On 03/10/2017 11:17, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2017-09-29 13:34:57)
>> From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
>>
>> This reduces the cost of the software engine busyness tracking
>> to a single no-op instruction when there are no listeners.
>>
>> We add a new i915 ordered workqueue to be used only for tasks
>> not needing struct mutex.
>>
>> v2: Rebase and some comments.
>> v3: Rebase.
>> v4: Checkpatch fixes.
>> v5: Rebase.
>> v6: Use system_long_wq to avoid being blocked by struct_mutex
>>      users.
>> v7: Fix bad conflict resolution from last rebase. (Dmitry Rogozhkin)
>> v8: Rebase.
>> v9:
>>   * Fix race between unordered enable followed by disable.
>>     (Chris Wilson)
>>   * Prettify order of local variable declarations. (Chris Wilson)
>
> Ok, I can't see a downside to enabling the optimisation even if it will
> be global and not per-device/per-engine.

For this one I did a quick test with gem_exec_nop and saw a reduction of
around 0.5% in time spent in intel_lrc_irq_handler in the case where the
PMU is not active.

So it is a bit underwhelming and unless I can get different results 
after re-measuring a few times, I'd say it is not worth the complication 
of putting this in. At least it is there in history so it can be pulled 
in if needed.
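
For anyone reading this later without the patch at hand, here is a rough
userspace sketch of the gating idea, not the patch itself. A plain global
counter stands in for the kernel static key (which would be declared with
DEFINE_STATIC_KEY_FALSE() and tested with static_branch_unlikely(), so the
disabled case is patched down to a no-op rather than a load and compare).
All names below are hypothetical and chosen only for illustration.

```c
/* Userspace analogue only: a counter stands in for the kernel static key. */
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names throughout; none of these are taken from the patch. */
static int stats_listeners;		/* global, shared by every engine */

struct engine {
	unsigned long irqs_handled;
	unsigned long busy_events;	/* only updated while stats are on */
};

static inline bool stats_enabled(void)
{
	/* Kernel code would use static_branch_unlikely() on a static key. */
	return __builtin_expect(stats_listeners > 0, 0);
}

static void stats_enable(void)
{
	stats_listeners++;
}

static void stats_disable(void)
{
	assert(stats_listeners > 0);	/* an unbalanced disable is a bug */
	stats_listeners--;
}

/* Hot path: the accounting is skipped entirely when nobody is listening. */
static void engine_irq(struct engine *e)
{
	e->irqs_handled++;
	if (stats_enabled())
		e->busy_events++;
}
```

Note the gate is one global switch, so a listener on any engine turns the
accounting on for all of them, which is the global versus per-engine
trade-off mentioned above.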

Regards,

Tvrtko
