Re: [PATCH igt] igt/perf_pmu: Speed up frequency measurement

Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> · Tue, 19 Dec 2017 12:57:54 +0000

On 15/12/2017 21:05, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2017-12-15 18:24:42)

On 15/12/2017 17:05, Chris Wilson wrote:
Use the normal batch_duration_ns and display the sampled frequency:

       Frequency: min=100, max=750, boost=750 MHz
       Min frequency: requested 100.0, actual 100.0
       Max frequency: requested 755.6, actual 755.6

Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
---
   tests/perf_pmu.c | 40 +++++++++++++++++++++++++---------------
   1 file changed, 25 insertions(+), 15 deletions(-)

diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
index d88287c17..61ae96d9a 100644
--- a/tests/perf_pmu.c
+++ b/tests/perf_pmu.c
@@ -931,9 +931,10 @@ test_interrupts(int gem_fd)
   static void
   test_frequency(int gem_fd)
   {
-     const uint64_t duration_ns = 2e9;

I think I remember why it was this long - because in the early days test
was actually applying load, not modifying frequencies directly, so had
to wait for the frequency to ramp up.

       uint32_t min_freq, max_freq, boost_freq;
-     uint64_t min[2], max[2], start[2];
+     uint64_t val[2], start[2];
+     double min[2], max[2];
+     unsigned long slept;
       igt_spin_t *spin;
       int fd, sysfs;
   
@@ -962,17 +963,19 @@ test_frequency(int gem_fd)
       igt_require(igt_sysfs_set_u32(sysfs, "gt_boost_freq_mhz", min_freq));
       igt_require(igt_sysfs_get_u32(sysfs, "gt_boost_freq_mhz") == min_freq);
   
+     gem_quiescent_gpu(gem_fd); /* Idle to be sure the change takes effect */
+     spin = igt_spin_batch_new(gem_fd, 0, I915_EXEC_RENDER, 0);
       pmu_read_multi(fd, 2, start);
   
-     spin = igt_spin_batch_new(gem_fd, 0, I915_EXEC_RENDER, 0);
-     igt_spin_batch_set_timeout(spin, duration_ns);
-     gem_sync(gem_fd, spin->handle);
+     slept = measured_usleep(batch_duration_ns / 1000);

Wouldn't it be more precise to read the pmu at this point?

Yes. That makes sense, I was still in busy mode where we wanted it to
stop being busy simultaneously with the timer ceasing. Thinking maybe
tieing the two together and include the time for gem_busy to report
idle. Hopefully the discrepancy is less than a microsecond at this point.

  
+     igt_spin_batch_end(spin);
   
-     pmu_read_multi(fd, 2, min);
-     min[0] -= start[0];
-     min[1] -= start[1];
+     pmu_read_multi(fd, 2, val);
+     min[0] = 1e9*(val[0] - start[0]) / slept;
+     min[1] = 1e9*(val[1] - start[1]) / slept;
   
       igt_spin_batch_free(gem_fd, spin);
+     gem_quiescent_gpu(gem_fd); /* Don't leak busy bo into the next phase */
   
       usleep(1e6);
   
@@ -987,17 +990,19 @@ test_frequency(int gem_fd)
       igt_require(igt_sysfs_set_u32(sysfs, "gt_min_freq_mhz", max_freq));
       igt_require(igt_sysfs_get_u32(sysfs, "gt_min_freq_mhz") == max_freq);
   
+     gem_quiescent_gpu(gem_fd);
+     spin = igt_spin_batch_new(gem_fd, 0, I915_EXEC_RENDER, 0);
       pmu_read_multi(fd, 2, start);
   
-     spin = igt_spin_batch_new(gem_fd, 0, I915_EXEC_RENDER, 0);
-     igt_spin_batch_set_timeout(spin, duration_ns);
-     gem_sync(gem_fd, spin->handle);
+     slept = measured_usleep(batch_duration_ns / 1000);
+     igt_spin_batch_end(spin);
   
-     pmu_read_multi(fd, 2, max);
-     max[0] -= start[0];
-     max[1] -= start[1];
+     pmu_read_multi(fd, 2, val);
+     max[0] = 1e9*(val[0] - start[0]) / slept;
+     max[1] = 1e9*(val[1] - start[1]) / slept;

I was also thinking that we should
assert_within_epsilon(max[0], ref, 5), i.e. that the reported average
frequency is what we expected. That would be better than asserting that
the actual frequency was less than the requested (who knows what the
future holds).

Sounds ok.

       igt_spin_batch_free(gem_fd, spin);
+     gem_quiescent_gpu(gem_fd);

Ah I see.. only for the spin batch. Why not then gem_sync or maybe we
should add igt_spin_batch_free_sync?

gem_quiescent_gpu goes one step further than gem_sync and says the system is
idle / parked afterwards. Which is often quite important

Yes, seems like I'm repeating this pattern often enough that throwing it
into igt_spin_batch is worthwhile. Also I want to include a spin_batch
variant that guarantees it has started executing before returning.
Sadly will require MI_STORE_DWORD so limit it's availability. I think
I'll wait for the spin_batch options to land before adding more
parameters.

Not sure what is the status. I'd be OK with either gem_quiescent_gpu at 
the start of subtests, or making igt_spin_batch_free ensure batch 
finished. That should also stop all leaks between tests AFAICT.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx