Re: [PATCH i-g-t] scripts/trace.pl: Optimize event parsing and processing

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 19/12/2017 16:23, John Harrison wrote:
On 12/19/2017 3:15 AM, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>

A couple of small optimizations which altogether bring around 30%
improvement in my testing.

1. Do less string processing on tracepoints names and push more of the
    check into the if-ladder.

2. Pull out common db key and ctx processing and cache common values in
    local vars.

3. Key value pair parsing is faster with a regexp.

4. Avoid sorting the db hash multiple times if possible.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Cc: John Harrison <John.C.Harrison@xxxxxxxxx>
---
  scripts/trace.pl | 114 ++++++++++++++++++++++++-------------------------------
  1 file changed, 49 insertions(+), 65 deletions(-)
---
John please check if this helps with your monster traces before we decide
if it is worth putting this in. Well, there isn't much of a downside
except touching something which works and risking breaking it.

diff --git a/scripts/trace.pl b/scripts/trace.pl
index 98e4a9843a43..5463e1fd1954 100755
--- a/scripts/trace.pl
+++ b/scripts/trace.pl
@@ -356,110 +356,91 @@ my $prev_freq = 0;
  my $prev_freq_ts = 0;
  while (<>) {
      my @fields;
-    my @tmp;
      my $tp_name;
-    my $time;
      my %tp;
-    my $key;
+    my ($time, $ctx, $ring, $seqno, $orig_ctx, $key);
      chomp;
      @fields = split ' ';
+    chop $fields[3];
+    $time = int($fields[3] * 1000000.0);
+
      $tp_name = $fields[4];
-    @tmp = split ':', $tp_name, 2;
-    next unless $tmp[0] eq 'i915';
-    $tp_name = $tmp[1];
-    chop $tp_name;
-    chop $fields[3];
-    $time = $fields[3] * 1000000.0;
      splice @fields, 0, 5;
      foreach my $f (@fields) {
-        my @kv = split '=|,', $f;
-
-        $kv[0] = 'global' if $kv[0] eq 'global_seqno';
+        my ($k, $v);
-        $tp{$kv[0]} = $kv[1];
+        ($k, $v) = ($f =~ /^(\S+)=(\d+)/);
I recall something about using () selection is slower than using pre-defined values. E.g.
   $f =~ m/=/; ($k, $v) = ($`, $');

Or do you specifically need the filtering of only matching numeric values?

It doesn't matter at the moment, all the script touches just happens to be numeric so either way works. I did not know of this trick and it is indeed marginally faster. So I've sent an update with it included.

What was the effect of all this on your big traces? I am only testing with a smaller one which goes from ~3.3s to ~2.2s. On a larger trace it might be non-linear gains due to double sort avoidance, unless there will be some other effects to cancel that out.

+        next unless defined $k;
+        $k = 'global' if $k eq 'global_seqno';
+        $tp{$k} = $v;
      }
      next if exists $tp{'ring'} and exists $ignore_ring{$tp{'ring'}};
-    if ($tp_name eq 'i915_gem_request_wait_begin') {
-        my %rw;
+    if (exists $tp{'ring'} and exists $tp{'ctx'} and exists $tp{'seqno'}) {
+        $ring = $tp{'ring'};
+        $ctx = $tp{'ctx'};
+        $seqno = $tp{'seqno'};
+
+        $orig_ctx = $ctx;
+        $ctx = sanitize_ctx($ctx, $ring);
+        $key = db_key($ring, $ctx, $seqno);
+    }
-        $tp{'ctx'} = sanitize_ctx($tp{'ctx'}, $tp{'ring'});
-        $key = db_key($tp{'ring'}, $tp{'ctx'}, $tp{'seqno'});
+    if ($tp_name eq 'i915:i915_gem_request_wait_begin:') {
+        my %rw;
          next if exists $reqwait{$key};
          $rw{'key'} = $key;
-        $rw{'ring'} = $tp{'ring'};
-        $rw{'seqno'} = $tp{'seqno'};
-        $rw{'ctx'} = $tp{'ctx'};
+        $rw{'ring'} = $ring;
+        $rw{'seqno'} = $seqno;
+        $rw{'ctx'} = $ctx;
          $rw{'start'} = $time;
          $reqwait{$key} = \%rw;
-        next;
-    } elsif ($tp_name eq 'i915_gem_request_wait_end') {
-        $tp{'ctx'} = sanitize_ctx($tp{'ctx'}, $tp{'ring'});
-        $key = db_key($tp{'ring'}, $tp{'ctx'}, $tp{'seqno'});
-
+    } elsif ($tp_name eq 'i915:i915_gem_request_wait_end:') {
          next unless exists $reqwait{$key};
          $reqwait{$key}->{'end'} = $time;
-        next;
-    } elsif ($tp_name eq 'i915_gem_request_add') {
-        my $orig_ctx = $tp{'ctx'};
-
-        $tp{'ctx'} = sanitize_ctx($tp{'ctx'}, $tp{'ring'});
-        $key = db_key($tp{'ring'}, $tp{'ctx'}, $tp{'seqno'});
-
+    } elsif ($tp_name eq 'i915:i915_gem_request_add:') {
          if (exists $queue{$key}) {
              $ctxdb{$orig_ctx}++;
-            $tp{'ctx'} = sanitize_ctx($tp{'ctx'}, $tp{'ring'});
-            $key = db_key($tp{'ring'}, $tp{'ctx'}, $tp{'seqno'});
+            $ctx = sanitize_ctx($ctx, $ring);
This bit originally worked on $orig_ctx not $ctx (i.e., before your previous refactoring patch). Which should it be? Will a double sanitisation step work as intended? I don't actually hit this line in any of the runs I've done.

Yeah well spotted, copy and paste error and fix sent. R-b would be cool on it.

This only triggers if you trace something which creates and destroys contexts due context id recycling. Here it auto-detects an impossible condition and deduces it must be a new context, so it assigns a new suffix to it.

Regards,

Tvrtko


+            $key = db_key($ring, $ctx, $seqno);
          }
          $queue{$key} = $time;
-        next;
-    } elsif ($tp_name eq 'i915_gem_request_submit') {
-        $tp{'ctx'} = sanitize_ctx($tp{'ctx'}, $tp{'ring'});
-        $key = db_key($tp{'ring'}, $tp{'ctx'}, $tp{'seqno'});
-
+    } elsif ($tp_name eq 'i915:i915_gem_request_submit:') {
          die if exists $submit{$key};
          die unless exists $queue{$key};
          $submit{$key} = $time;
-        next;
-    } elsif ($tp_name eq 'i915_gem_request_in') {
+    } elsif ($tp_name eq 'i915:i915_gem_request_in:') {
          my %req;
-        $tp{'ctx'} = sanitize_ctx($tp{'ctx'}, $tp{'ring'});
-        $key = db_key($tp{'ring'}, $tp{'ctx'}, $tp{'seqno'});
-
          die if exists $db{$key};
          die unless exists $queue{$key};
          die unless exists $submit{$key};
          $req{'start'} = $time;
-        $req{'ring'} = $tp{'ring'};
-        $req{'seqno'} = $tp{'seqno'};
-        $req{'ctx'} = $tp{'ctx'};
-        $req{'name'} = $tp{'ctx'} . '/' . $tp{'seqno'};
+        $req{'ring'} = $ring;
+        $req{'seqno'} = $seqno;
+        $req{'ctx'} = $ctx;
+        $req{'name'} = $ctx . '/' . $seqno;
          $req{'global'} = $tp{'global'};
          $req{'port'} = $tp{'port'};
          $req{'queue'} = $queue{$key};
          $req{'submit-delay'} = $submit{$key} - $queue{$key};
          $req{'execute-delay'} = $req{'start'} - $submit{$key};
-        $rings{$tp{'ring'}} = $gid++ unless exists $rings{$tp{'ring'}};
-        $ringmap{$rings{$tp{'ring'}}} = $tp{'ring'};
+        $rings{$ring} = $gid++ unless exists $rings{$ring};
+        $ringmap{$rings{$ring}} = $ring;
          $db{$key} = \%req;
-        next;
-    } elsif ($tp_name eq 'i915_gem_request_out') {
-        my $gkey = global_key($tp{'ring'}, $tp{'global'});
-
-        $tp{'ctx'} = sanitize_ctx($tp{'ctx'}, $tp{'ring'});
-        $key = db_key($tp{'ring'}, $tp{'ctx'}, $tp{'seqno'});
+    } elsif ($tp_name eq 'i915:i915_gem_request_out:') {
+        my $gkey = global_key($ring, $tp{'global'});
          die unless exists $db{$key};
          die unless exists $db{$key}->{'start'};
@@ -476,15 +457,12 @@ while (<>) {
          }
          $db{$key}->{'duration'} = $db{$key}->{'notify'} - $db{$key}->{'start'};           $db{$key}->{'context-complete-delay'} = $db{$key}->{'end'} - $db{$key}->{'notify'};
-        next;
-    } elsif ($tp_name eq 'intel_engine_notify') {
-        $notify{global_key($tp{'ring'}, $tp{'seqno'})} = $time;
-        next;
-    } elsif ($tp_name eq 'intel_gpu_freq_change') {
+    } elsif ($tp_name eq 'intel_engine_notify:') {
+        $notify{global_key($ring, $seqno)} = $time;
+    } elsif ($tp_name eq 'i915:intel_gpu_freq_change:') {
          push @freqs, [$prev_freq_ts, $time, $prev_freq] if $prev_freq;
          $prev_freq_ts = $time;
          $prev_freq = $tp{'new_freq'};
-        next;
      }
  }
@@ -557,7 +535,10 @@ my (%submit_avg, %execute_avg, %ctxsave_avg);
  my $last_ts = 0;
  my $first_ts;
-foreach my $key (sort {$db{$a}->{'start'} <=> $db{$b}->{'start'}} keys %db) { +my @sorted_keys = sort {$db{$a}->{'start'} <=> $db{$b}->{'start'}} keys %db;
+my $re_sort = 0;
+
+foreach my $key (@sorted_keys) {
      my $ring = $db{$key}->{'ring'};
      my $end = $db{$key}->{'end'};
@@ -585,6 +566,7 @@ foreach my $key (sort {$db{$a}->{'start'} <=> $db{$b}->{'start'}} keys %db) {
          # 20us tolerance
          if (exists $db{$next_key} and $db{$next_key}->{'start'} < $start + 20) {
+            $re_sort = 1;
              $db{$next_key}->{'start'} = $start + $db{$key}->{'duration'};               $db{$next_key}->{'start'} = $db{$next_key}->{'end'} if $db{$next_key}->{'start'} > $db{$next_key}->{'end'};               $db{$next_key}->{'duration'} = $db{$next_key}->{'notify'} - $db{$next_key}->{'start'}; @@ -601,6 +583,8 @@ foreach my $key (sort {$db{$a}->{'start'} <=> $db{$b}->{'start'}} keys %db) {
      $ctxsave_avg{$ring} += $db{$key}->{'end'} - $db{$key}->{'notify'};
  }
+@sorted_keys = sort {$db{$a}->{'start'} <=> $db{$b}->{'start'}} keys %db if $re_sort;
+
  foreach my $ring (keys %batch_avg) {
      $batch_avg{$ring} /= $batch_count{$ring};
      $batch_total_avg{$ring} /= $batch_count{$ring};
@@ -616,7 +600,7 @@ foreach my $gid (sort keys %rings) {
      my (@s_, @e_);
      # Extract all GPU busy intervals and sort them.
-    foreach my $key (sort {$db{$a}->{'start'} <=> $db{$b}->{'start'}} keys %db) {
+    foreach my $key (@sorted_keys) {
          next unless $db{$key}->{'ring'} == $ring;
          push @s_, $db{$key}->{'start'};
          push @e_, $db{$key}->{'end'};
@@ -660,7 +644,7 @@ my @gpu_intervals;
  my (@s_, @e_);
  # Extract all GPU busy intervals and sort them.
-foreach my $key (sort {$db{$a}->{'start'} <=> $db{$b}->{'start'}} keys %db) {
+foreach my $key (@sorted_keys) {
      push @s_, $db{$key}->{'start'};
      push @e_, $db{$key}->{'end'};
      die if $db{$key}->{'start'} > $db{$key}->{'end'};

_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux