On 25 September 2017 at 18:08, Jeff King <peff@xxxxxxxx> wrote:
> On Sun, Sep 24, 2017 at 09:59:28PM +0200, Martin Ågren wrote:
>
>> > I'm not sure of the best way to count things.
>> >
> But at least on the topic of "how many unique leaks are there", I wrote
> the script below to try to give some basic answers. It just finds the
> first non-boring entry in each stack trace and reports that. Where
> "boring" is really "this function is not expected to free, but hands off
> memory ownership to somebody else".

Thanks. I combined your script with this:

-- >8 --
#!/usr/bin/perl -w

# Extract the stacktraces and identify them
# by their SHA hashes (these identifiers are
# not guaranteed to be stable across
# re-compilations of the Git binaries).

use Digest::SHA qw(sha1 sha1_hex);

my $ctx = Digest::SHA->new("SHA-1");
my $stage = 0;

while (<>) {
	my $collect = 0;
	if ($stage == 0 && /irect leak of \d+ byte.*allocated from:$/) {
		# Header of a "Direct leak" or "Indirect leak" report.
		$stage++;
		$collect = 1;
	} elsif ($stage == 1 && /^\s*\#\d+\s+/) {
		# Stack frame line ("#0 ...", "#1 ...").
		$collect = 1;
	} elsif ($stage == 1 && /^\s*$/) {
		# A blank line ends the trace: emit its hash and start over.
		my $digest = $ctx->hexdigest;
		printf "Stacktrace-hash: %s\n", $digest;
		$ctx = Digest::SHA->new("SHA-1");
		$stage = 0;
	} elsif ($stage == 1) {
		print "warning: unidentified string '$_'\n";
	}

	if ($collect) {
		# Feed the raw line into the digest.
		$ctx->add($_);
		print;
	}
}
-- >8 --

Then I report various ad-hoc metrics:

-- >8 --
#!/bin/bash

for d in "$@"
do
	echo "$d"
	echo -n "  direct leaks: "
	grep "Direct leak" "$d"/* | wc -l
	echo -n "  indirect leaks: "
	grep "Indirect leak" "$d"/* | wc -l
	echo -n "  allocating places: "
	perl leaks.pl "$d"/* | sort -u | wc -l
	echo -n "  most common allocating place: "
	perl leaks.pl "$d"/* | sort \
		| uniq -c | sort -nr | head -1 | awk '{print $1;}'
	echo -n "  size of leak-reports: "
	cat "$d"/* | wc -l
	echo -n "  unique leaking stacktraces: "
	perl extract-traces.pl "$d"/* | grep "Stacktrace-hash" | sort -u | wc -l
	echo -n "  most common stacktrace: "
	perl extract-traces.pl "$d"/* | grep "Stacktrace-hash" | sort \
		| uniq -c | sort -nr | head -1 | awk '{print $1;}'
done
-- >8 --

If PIDs of leaking processes collide, reports are lost. Something like
this as root helps: `echo 4194303 > /proc/sys/kernel/pid_max`

Still, the numbers vary for back-to-back runs. Here are two runs on
master and two runs on master plus the lockfile-patches I just sent. (I
don't run all tests.)

lsan_ea220ee4
  direct leaks: 127165
  indirect leaks: 83897
  allocating places: 504
  most common allocating place: 10212
  size of leak-reports: 3662204
  unique leaking stacktraces: 83265
  most common stacktrace: 55
lsan_ea220ee4-rerun
  direct leaks: 127172
  indirect leaks: 83903
  allocating places: 504
  most common allocating place: 10212
  size of leak-reports: 3662334
  unique leaking stacktraces: 83644
  most common stacktrace: 57
lsan_ea220ee4+lockfile_fixes
  direct leaks: 118678
  indirect leaks: 83908
  allocating places: 493
  most common allocating place: 10212
  size of leak-reports: 3545563
  unique leaking stacktraces: 99834
  most common stacktrace: 32
lsan_ea220ee4+lockfile_fixes-rerun
  direct leaks: 118678
  indirect leaks: 83902
  allocating places: 491
  most common allocating place: 10212
  size of leak-reports: 3545463
  unique leaking stacktraces: 82171
  most common stacktrace: 40

> So I don't know how useful any of that will be, but it at least should
> give _some_ metric that should be diminishing as we fix leaks.

Indeed.

Martin
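
As a side note, the two "most common ..." metrics and the two "unique"/"allocating places" metrics all come down to the same counting idioms. Here is a minimal sketch on made-up input. The function name `most_common_count` and the sample hashes are hypothetical and only serve as illustration; they are not taken from any real leak report.

```shell
#!/bin/bash

# Hypothetical helper: print how many times the most frequent
# input line occurs (same pipeline as in the metrics script).
most_common_count () {
	sort | uniq -c | sort -nr | head -n 1 | awk '{print $1}'
}

# Made-up lines standing in for "Stacktrace-hash: ..." output.
sample='Stacktrace-hash: aaaa
Stacktrace-hash: bbbb
Stacktrace-hash: aaaa
Stacktrace-hash: cccc
Stacktrace-hash: aaaa'

# Occurrences of the most common line.
printf '%s\n' "$sample" | most_common_count

# Number of distinct lines.
printf '%s\n' "$sample" | sort -u | wc -l
```

With this sample, the first pipeline reports 3 (the "aaaa" line) and the second counts 3 distinct lines. Since `uniq -c` only merges adjacent duplicates, the initial `sort` is what makes the count meaningful.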