On 12/21/2017 1:34 AM, Tvrtko Ursulin wrote:
On 20/12/2017 23:50, John Harrison wrote:
On 12/20/2017 1:54 AM, Tvrtko Ursulin wrote:
What was the effect of all this on your big traces? I am only
testing with a smaller one which goes from ~3.3s to ~2.2s. On a
larger trace there might be non-linear gains due to double-sort
avoidance, unless some other effects cancel that out.
So with a trace of a shortened gem_exec_nop/basic_sequential, the
'perf script' output is 439MB and the original trace.pl before any of
the changes took ~180s. After the 'auto-detect field order' patch, it
went up to ~201s. With the optimisation patch it is down to ~129s.
However, I am also seeing differences in the HTML output since the
optimisation patch. The differences aren't massive, just slight
variations in the times. The structure is all the same, it's just that
the accounting and/or timestamps are out.
For example:
{id: 1, content: 'Ring0<br><small><br>79.48% idle<br><br>34.32%
busy<br>584.97% runnable<br>2103.60% queued<br><br>16.18%
wait<br><br>200931 batches<br>331.28us avg batch<br>331.38us avg
engine batch<br></small>'},
vs
{id: 1, content: 'Ring0<br><small><br>79.48% idle<br><br>34.32%
busy<br>584.97% runnable<br>2103.60% queued<br><br>16.18%
wait<br><br>200931 batches<br>338.56us avg batch<br>338.56us avg
engine batch<br></small>'},
Or:
{id: 58, key: -210383407, type: 'range', group: 4, subgroup: 2,
subgroupOrder: 3, content: '428/3 <small>0</small>
<small><i>???</i></small> <small><i>++</i></small> <br>142us
<small>(0us)</small>', start: '2017-01-05 21:27:45.352968', end:
'2017-01-05 21:27:45.353110', style: 'color: white; background-color:
red;'},
vs
{id: 58, key: -210383407, type: 'range', group: 4, subgroup: 2,
subgroupOrder: 3, content: '428/3 <small>0</small>
<small><i>???</i></small> <small><i>++</i></small> <br>159us
<small>(0us)</small>', start: '2017-01-05 21:27:45.352968', end:
'2017-01-05 21:27:45.353127', style: 'color: white; background-color:
red;'},
I can send you the full output and the source logs too, if useful. The
HTML output is about 840KB but, as noted, the perf logs are hundreds
of MBs.
I was able to reproduce it. I think it's down to floating point
mishandling. Neither the original nor the optimized version were
correct and both are accumulating error. I'll send an updated patch
shortly.
Regards,
Tvrtko
The new version with +0.5 gives me exactly the same results as the
previous version without the rounding. If I get rid of the int()
conversion too, I still get the same results.
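For reference, the effect being discussed can be shown in a few lines. This is an illustrative sketch (in Python rather than Perl, and not taken from trace.pl), reusing the .352968/.353110 timestamps from the example rows above; `to_us` is a stand-in for Perl's int($t * 1e6 + 0.5) half-up rounding:

```python
# Illustrative sketch, not from trace.pl: converting perf timestamps
# (float seconds) to integer microseconds before subtracting keeps the
# durations stable.

def to_us(seconds):
    # Round half-up to whole microseconds, as int($t * 1e6 + 0.5) does in Perl.
    return int(seconds * 1e6 + 0.5)

start = 45.352968   # seconds, from the 'start' field in the example above
end = 45.353110     # seconds, from the first 'end' field above

duration_us = to_us(end) - to_us(start)
print(duration_us)  # 142, matching the '142us' content string

# Without the +0.5, int() truncates after the float multiply/subtract,
# so representation error can drop a whole microsecond, and summing many
# such durations accumulates the error.
```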
What does make a difference is if I fix up the broken processing of
notify traces:
diff --git a/scripts/trace.pl b/scripts/trace.pl
index 8dbd497b..20ae3c78 100755
--- a/scripts/trace.pl
+++ b/scripts/trace.pl
@@ -382,14 +382,16 @@ while (<>) {
 	next if exists $tp{'ring'} and exists $ignore_ring{$tp{'ring'}};

-	if (exists $tp{'ring'} and exists $tp{'ctx'} and exists $tp{'seqno'}) {
+	if (exists $tp{'ring'} and exists $tp{'seqno'}) {
 		$ring = $tp{'ring'};
-		$ctx = $tp{'ctx'};
 		$seqno = $tp{'seqno'};
-		$orig_ctx = $ctx;
-		$ctx = sanitize_ctx($ctx, $ring);
-		$key = db_key($ring, $ctx, $seqno);
+		if (exists $tp{'ctx'}) {
+			$ctx = $tp{'ctx'};
+			$orig_ctx = $ctx;
+			$ctx = sanitize_ctx($ctx, $ring);
+			$key = db_key($ring, $ctx, $seqno);
+		}
 	}

 	if ($tp_name eq 'i915:i915_gem_request_wait_begin:') {
@@ -461,7 +463,7 @@ while (<>) {
 		}
 		$db{$key}->{'duration'} = $db{$key}->{'notify'} - $db{$key}->{'start'};
 		$db{$key}->{'context-complete-delay'} = $db{$key}->{'end'} - $db{$key}->{'notify'};
-	} elsif ($tp_name eq 'intel_engine_notify:') {
+	} elsif ($tp_name eq 'i915:intel_engine_notify:') {
 		$notify{global_key($ring, $seqno)} = $time;
 	} elsif ($tp_name eq 'i915:intel_gpu_freq_change:') {
 		push @freqs, [$prev_freq_ts, $time, $prev_freq] if $prev_freq;
With that, I get exactly the same HTML page as before the optimisation
patch.
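For anyone skimming the diff: 'perf script' prints the event name with its subsystem prefix, so the bare 'intel_engine_notify:' comparison never matched and the notify timestamps were silently dropped. A rough illustration (in Python, with a made-up sample line; the surrounding field layout is an assumption, not copied from a real trace):

```python
import re

# Hypothetical 'perf script'-style line; the fields around the event
# token are invented for illustration, only the token itself matters.
line = ("gem_exec_nop  1234 [002]  45.353110: "
        "i915:intel_engine_notify: dev=0, ring=0, seqno=428")

# The event token carries its subsystem prefix ('i915:...'), so a
# comparison against the bare 'intel_engine_notify:' never fires.
tp_name = re.search(r'(\S+:\S+:)\s', line).group(1)

print(tp_name == 'intel_engine_notify:')        # False: the old check misses
print(tp_name == 'i915:intel_engine_notify:')   # True: the fixed check matches
```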
John.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx