On 11/22/2017 3:06 PM, Chris Wilson wrote:
Quoting Sagar Arun Kamble (2017-11-22 07:41:02)
On 11/22/2017 2:29 AM, Chris Wilson wrote:
Instead of sleeping for a fixed 1ms (roughly, depending on timer slack),
start with a small sleep and exponentially increase the sleep on each
cycle.
A good example of a beneficiary is the guc mmio communication channel.
As Tvrtko said, for the current GuC communication (guc_send_mmio) we will
need to update the fast timeout of __intel_wait_for_register to 20us. The
improvement this patch proposes through wait_for will certainly be seen once
we switch over to GuC CT. Maybe specifying "GuC CT channel" here is apt.
guc mmio comm falls off the fastpath, hitting the sleeping wait_for, and
*is* improved by this patch.
Thanks for the clarification. I overlooked the 10ms sleeping wait in
_intel_wait_for_register.
As far as the latency experienced by gem_exec_latency goes, there is no
difference between a 10us sleep and spinning for 20us. Changing the spin
length to 20us is something that you should talk to the guc about; at that
point we really need an asynchronous communication channel (ct is still
being used synchronously).
Changing the intel_guc_send_mmio fast timeout is a different conversation,
one of more dubious merit because of this patch (i.e. if we can achieve the
same latency with a sleep, should we spin at all?).
Agree.
Typically we expect (and so spin) for 10us for a quick response, but this
doesn't cover everything and so sometimes we fall back to the millisecond+
sleep. This incurs a significant delay in time-critical operations like
preemption (igt/gem_exec_latency), which can be reduced considerably by
using a small sleep after the spin fails.
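
For illustration only, here is a rough userspace sketch of the
spin-then-backoff idea described above. wait_for_cond(), elapsed_us(), the
spin_us parameter and the plain usleep() call are made-up stand-ins, not the
kernel macro the patch actually changes (which uses local_clock() and
usleep_range()):

#include <stdbool.h>
#include <time.h>
#include <unistd.h>

static long elapsed_us(const struct timespec *start)
{
        struct timespec now;

        clock_gettime(CLOCK_MONOTONIC, &now);
        return (now.tv_sec - start->tv_sec) * 1000000L +
               (now.tv_nsec - start->tv_nsec) / 1000L;
}

/* Busy-spin briefly for a quick response, then fall back to sleeping with
 * an exponentially growing interval instead of a fixed ~1ms sleep. */
static bool wait_for_cond(bool (*cond)(void), long spin_us, long timeout_us,
                          long wait_min_us, long wait_max_us)
{
        struct timespec start;
        long wait_us = wait_min_us;

        clock_gettime(CLOCK_MONOTONIC, &start);

        while (elapsed_us(&start) < spin_us)    /* fast path: ~10us spin */
                if (cond())
                        return true;

        for (;;) {
                if (cond())
                        return true;
                if (elapsed_us(&start) > timeout_us)
                        return cond();          /* one last check on timeout */

                usleep(wait_us);                /* stand-in for usleep_range() */
                if (wait_us < wait_max_us)
                        wait_us <<= 1;          /* back off exponentially */
        }
}

A condition that flips within a few microseconds is caught by the spin and
never sleeps at all, while a slow condition pays only a handful of short
sleeps before the interval saturates at wait_max_us.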
We've made this suggestion many times, but had little experimental data
to support adding the complexity.
v2: Bump the minimum usleep to 10us on advice of
Documentation/timers/timers-howto.txt (Tvrtko)
v3: Specify a min, max range for the usleep intervals -- some code may
crucially depend upon the sleep pattern and so want to specify it.
References: 1758b90e38f5 ("drm/i915: Use a hybrid scheme for fast register waits")
Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Cc: John Harrison <John.C.Harrison@xxxxxxxxx>
Cc: Michał Winiarski <michal.winiarski@xxxxxxxxx>
Cc: Ville Syrjala <ville.syrjala@xxxxxxxxxxxxxxx>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@xxxxxxxxx>
---
drivers/gpu/drm/i915/intel_drv.h | 11 +++++++----
drivers/gpu/drm/i915/intel_pm.c | 2 +-
2 files changed, 8 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 635a96fcd788..c00441a3d649 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -48,8 +48,9 @@
* having timed out, since the timeout could be due to preemption or similar and
* we've never had a chance to check the condition before the timeout.
*/
-#define _wait_for(COND, US, W) ({ \
+#define _wait_for(COND, US, Wmin, Wmax) ({ \
unsigned long timeout__ = jiffies + usecs_to_jiffies(US) + 1; \
+ long wait__ = (Wmin); /* recommended min for usleep is 10 us */ \
int ret__; \
might_sleep(); \
for (;;) { \
@@ -62,12 +63,14 @@
ret__ = -ETIMEDOUT; \
break; \
} \
- usleep_range((W), (W) * 2); \
+ usleep_range(wait__, wait__ * 2); \
+ if (wait__ < (Wmax)) \
+ wait__ <<= 1; \
I think we need to keep track of the total time we have waited, else we
might wait for longer than necessary.
This is not a precise wait; it's a sleep with a rough upper bound. Sleeps
by their very nature are very rough: you are giving up the processor and
telling it not to wake you before a certain time. We are not asking to be
woken at that time, just not before.
E.g. for wait_for_us(COND, 900) this approach might actually lead to a
sleep of 1270us.
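(Working that figure through, assuming each usleep_range() hits its minimum
bound: with Wmin = 10 the sleeps are 10 + 20 + 40 + 80 + 160 + 320 = 630us,
still inside the 900us budget, so one more 640us sleep is taken before the
timeout is noticed, for 1270us slept in total.)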
The numbers are irrelevant; they are merely round numbers from once upon a
time using msleep(1).
Ok. Thanks for the clarification.
-Chris