Re: [igt-dev] [PATCH i-g-t 1/2] igt/perf_pmu: Aim for a fixed number of iterations for calibrating accuracy

Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> · Mon, 13 Aug 2018 10:20:28 +0100

On 10/08/2018 14:25, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2018-08-09 12:54:41)

On 08/08/2018 15:59, Chris Wilson wrote:
Our observation is that the systematic error is proportional to the
number of iterations we perform; the suspicion is that it directly
correlates with the number of sleeps. Reduce the number of iterations,
to try and keep the error in check.

Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
---
   tests/perf_pmu.c | 34 +++++++++++++++++++++-------------
   1 file changed, 21 insertions(+), 13 deletions(-)

diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
index 9a20abb6b..5a26d5272 100644
--- a/tests/perf_pmu.c
+++ b/tests/perf_pmu.c
@@ -1521,14 +1521,13 @@ static void __rearm_spin_batch(igt_spin_t *spin)
   
   static void
   accuracy(int gem_fd, const struct intel_execution_engine2 *e,
-      unsigned long target_busy_pct)
+      unsigned long target_busy_pct,
+      unsigned long target_iters)
   {
-     unsigned long busy_us = 10000 - 100 * (1 + abs(50 - target_busy_pct));
-     unsigned long idle_us = 100 * (busy_us - target_busy_pct *
-                             busy_us / 100) / target_busy_pct;
       const unsigned long min_test_us = 1e6;
-     const unsigned long pwm_calibration_us = min_test_us;
-     const unsigned long test_us = min_test_us;
+     unsigned long pwm_calibration_us;
+     unsigned long test_us;
+     unsigned long cycle_us, busy_us, idle_us;
       double busy_r, expected;
       uint64_t val[2];
       uint64_t ts[2];
@@ -1538,18 +1537,27 @@ accuracy(int gem_fd, const struct intel_execution_engine2 *e,
       /* Sampling platforms cannot reach the high accuracy criteria. */
       igt_require(gem_has_execlists(gem_fd));
   
-     while (idle_us < 2500) {
+     /* Aim for approximately 100 iterations for calibration */
+     cycle_us = min_test_us / target_iters;
+     busy_us = cycle_us * target_busy_pct / 100;
+     idle_us = cycle_us - busy_us;

2% load, 1s / 10 iters
         cycle_us = 100ms
         busy_us = 2ms
         idle_us = 98ms
...

+
+     while (idle_us < 2500 || busy_us < 2500) {
               busy_us *= 2;
               idle_us *= 2;

...

busy_us = 4ms
idle_us = 196ms
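
For reference, the 2% example can be reproduced with a small standalone sketch of the new arithmetic. It is based only on the lines visible in the quoted hunk; `struct pwm_cycle` and `new_pwm_cycle()` are hypothetical names that do not exist in perf_pmu.c, and the early return for a zero-length phase is an added guard that the patch does not have.

```c
#include <assert.h>

/* Hypothetical result type, for illustration only. */
struct pwm_cycle {
	unsigned long busy_us;
	unsigned long idle_us;
};

/*
 * Split a 1s test period into target_iters cycles, divide each cycle
 * according to target_busy_pct, then double both phases until each one
 * lasts at least 2500us (as in the quoted while loop).
 */
static struct pwm_cycle
new_pwm_cycle(unsigned long target_busy_pct, unsigned long target_iters)
{
	const unsigned long min_test_us = 1000000;
	unsigned long cycle_us;
	struct pwm_cycle c;

	assert(target_iters > 0);

	cycle_us = min_test_us / target_iters;
	c.busy_us = cycle_us * target_busy_pct / 100;
	c.idle_us = cycle_us - c.busy_us;

	/* Added guard (assumption): doubling zero would never terminate. */
	if (c.busy_us == 0 || c.idle_us == 0)
		return c;

	while (c.idle_us < 2500 || c.busy_us < 2500) {
		c.busy_us *= 2;
		c.idle_us *= 2;
	}

	return c;
}
```

Under these assumptions, `new_pwm_cycle(2, 10)` starts at 2ms busy / 98ms idle and doubles once, giving 4ms / 196ms, so a 1s test fits 5 full cycles rather than the 10 requested.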

Currently it is 250ms per 98:2 cycle and about 20ms per 50:50 cycle. So
we are only doing 4 and 50 iterations respectively.
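
Those figures can be checked against the formula on the removed lines of the hunk. The sketch below is a standalone approximation: `old_cycle_us()` is a hypothetical helper name, the range assertion is an added assumption, and `labs()` with an explicit signed cast stands in for the original `abs()` call on an unsigned value.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Cycle length (busy + idle, in microseconds) as computed by the code
 * the patch removes, including the old "idle_us < 2500" doubling loop.
 */
static unsigned long old_cycle_us(unsigned long target_busy_pct)
{
	unsigned long busy_us, idle_us;

	/* Added assumption: outside 1..99 the formula divides by zero
	 * or yields idle_us == 0, which would loop forever below. */
	assert(target_busy_pct >= 1 && target_busy_pct <= 99);

	busy_us = 10000 - 100 * (1 + labs(50 - (long)target_busy_pct));
	idle_us = 100 * (busy_us - target_busy_pct * busy_us / 100) /
		  target_busy_pct;

	while (idle_us < 2500) {
		busy_us *= 2;
		idle_us *= 2;
	}

	return busy_us + idle_us;
}
```

With this sketch, `old_cycle_us(2)` is 255000 (5.1ms busy, 249.9ms idle), a little under 4 cycles per second, and `old_cycle_us(50)` is 19800, roughly 50 cycles per second, which is in line with the numbers quoted above.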

10 cycles is strictly an improvement :-p

Hmm indeed. It seems I misremembered how it works. I'll re-read your 
patches.

Regards,

Tvrtko
